NOVEL MOLECULES OF THE TANGO-77 RELATED PROTEIN
FAMILY AND USES THEREOF

Background of the Invention
The polypeptide cytokine interleukin-1 (IL-1)
is a critical mediator of inflammatory and overall immune
response. To date, three members of the IL-1 family,
IL-1α, IL-1β and IL-1ra (interleukin-1 receptor
antagonist), have been isolated and cloned. IL-1α and
IL-1β are proinflammatory cytokines which elicit
biological responses, whereas IL-1ra is an antagonist of
IL-1α and IL-1β activity. Two distinct cell-surface
receptors have been identified for these ligands, the
type I IL-1 receptor (IL-1RtI) and type II IL-1 receptor
(IL-1RtII). Recent results suggest that the IL-1RtI is
the receptor responsible for transducing a signal and
producing biological effects.

As mentioned above, IL-1 is a key mediator of the
host inflammatory response. While inflammation is an
important homeostatic mechanism, aberrant inflammation
has the potential for inducing damage to the host.
Elevated IL-1 levels are known to be associated with a
number of diseases, particularly autoimmune diseases and
inflammatory disorders.

Since IL-1ra is a naturally occurring inhibitor of
IL-1, IL-1ra can be used to limit the aberrant and
potentially deleterious effects of IL-1. In experimental
animals, pretreatment with IL-1ra has been shown to
prevent death resulting from lipopolysaccharide-induced
sepsis. The relative absence of IL-1ra has also been
suggested to play a role in human inflammatory bowel
disease.

Summary of the Invention
The present invention is based, at least in part,
on the discovery of a gene encoding Tango-77, a secreted
protein that is predicted to be a member of the cytokine
superfamily. The Tango-77 cDNA described below (SEQ ID
NO:1) has three possible open reading frames. The first
potential open reading frame encompasses 534 nucleotides
extending from nucleotide 356 to nucleotide 889 of SEQ ID
NO:1 (SEQ ID NO:3) and encodes a 178 amino acid protein
(SEQ ID NO:2). This protein may include a predicted
signal sequence of about 63 amino acids (from about amino
acid 1 to about amino acid 63 of SEQ ID NO:2 (SEQ ID
NO:4)) and a predicted mature protein of about 115 amino
acids (from about amino acid 64 to amino acid 178 of SEQ
ID NO:2 (SEQ ID NO:5)).

The second potential open reading frame
encompasses 498 nucleotides extending from nucleotide 389
to nucleotide 889 of SEQ ID NO:1 (SEQ ID NO:6) and
encodes a 167 amino acid protein (SEQ ID NO:7). This
protein may include a predicted signal sequence of about
52 amino acids (from about amino acid 1 to about amino
acid 52 of SEQ ID NO:7 (SEQ ID NO:8)) and a predicted
mature protein of about 115 amino acids (from about amino
acid 52 to amino acid 167 of SEQ ID NO:7 (SEQ ID NO:9)).

The third potential open reading frame encompasses
408 nucleotides extending from nucleotide 481 to
nucleotide 889 of SEQ ID NO:1 (SEQ ID NO:10) and encodes
a 136 amino acid protein (SEQ ID NO:11). This protein
includes a predicted signal sequence of about 21 amino
acids (from about amino acid 1 to about amino acid 21 of
SEQ ID NO:11 (SEQ ID NO:12)) and a predicted mature
protein of about 115 amino acids (from about amino acid
22 to amino acid 136 of SEQ ID NO:11 (SEQ ID NO:13)).
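
The coordinates and lengths quoted for the three open reading frames can be cross-checked with simple arithmetic. The following is a minimal sketch and not part of the original disclosure: the tuple layout and the function names are illustrative assumptions, and only the numbers are taken from the text above. For each frame it reports the inclusive span length, whether that span is a whole number of codons, and whether the signal and mature segment lengths add up to the stated protein length.

```python
# Values copied from the text above:
# (start_nt, end_nt, stated_nt, stated_aa, signal_aa, mature_aa)
ORFS = {
    "first": (356, 889, 534, 178, 63, 115),
    "second": (389, 889, 498, 167, 52, 115),
    "third": (481, 889, 408, 136, 21, 115),
}


def inclusive_length(start, end):
    """Number of positions from start to end, counting both endpoints."""
    return end - start + 1


def check_orf(start, end, stated_nt, stated_aa, signal_aa, mature_aa):
    """Compare the stated figures for one reading frame with plain arithmetic."""
    span = inclusive_length(start, end)
    return {
        "span_nt": span,
        "span_matches_stated": span == stated_nt,
        "whole_codons": span % 3 == 0,
        "stated_nt_matches_aa": stated_nt == 3 * stated_aa,
        "signal_plus_mature_ok": signal_aa + mature_aa == stated_aa,
    }


if __name__ == "__main__":
    for name, values in ORFS.items():
        print(name, check_orf(*values))
```

On the numbers as printed, only the first frame is fully self-consistent under inclusive counting; the second and third differ by a few nucleotides, which may reflect a different counting convention or a transcription error in this copy. The sequence listing itself remains the authoritative source.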

As used herein, the terms "Tango-77", "Tango-77
protein", "Tango-77 polypeptide" and the like can refer
to any polypeptide produced by the cDNA of SEQ ID NO:1,
including any and all of the Tango-77 gene products
described above.

Tango-77 is expected to inhibit inflammation and
play a functional role similar to that of secreted
IL-1ra. For example, it is expected that Tango-77 may
bind to the IL-1 receptor, thus blocking receptor
activation by inhibiting the binding of IL-1α and IL-1β
to the receptor. Alternatively, Tango-77 may inhibit
inflammation through another pathway, for example, by
binding to a novel receptor. Accordingly, Tango-77 may
be useful as a modulating agent in regulating a variety
of cellular processes including acute and chronic
inflammation, e.g., asthma, chronic myelogenous leukemia,
rheumatoid arthritis, psoriasis and inflammatory bowel
disease.

In one aspect, the invention provides isolated
nucleic acid molecules encoding Tango-77 or biologically
active portions thereof, as well as nucleic acid
fragments suitable as primers or hybridization probes for
the detection of Tango-77.

The invention encompasses methods of diagnosing
and treating patients who are suffering from a disorder
associated with an abnormal level (undesirably high or
undesirably low) of inflammation, abnormal activity of
the IL-1 receptor complex, or abnormal activity of IL-1,
by administering a compound that modulates the expression
of Tango-77 (at the DNA, mRNA or protein level, e.g., by
altering mRNA splicing) or by altering the activity of
Tango-77. Examples of such compounds include small
molecules, antisense nucleic acid molecules, ribozymes,
and polypeptides.

The invention features a nucleic acid molecule
which is at least 45% (e.g., 55%, 65%, 75%, 85%, 95%, or
98%) identical to the nucleotide sequence shown in SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, the
nucleotide sequence of the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807 (the "cDNA of
ATCC 98807"), or a complement thereof.

The invention features a nucleic acid molecule
which includes a fragment of at least 100 (e.g., 250,
325, 350, 375, 400, 425, 450, 500, 550, 600, 650, 700,
800, 900, or 989) nucleotides of the nucleotide sequence
shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID
NO:10, the nucleotide sequence of the cDNA ATCC 98807 or
a complement thereof.

The invention also features a nucleic acid
molecule which includes a nucleotide sequence encoding a
protein having an amino acid sequence that is at least
45% (55%, 65%, 75%, 85%, 95%, or 98%) identical to the
amino acid sequence of SEQ ID NO:2, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, or the
amino acid sequence encoded by the cDNA of ATCC 98807.
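
The identity thresholds above presuppose some way of scoring two sequences against each other. This passage does not say how percent identity is computed, so the following is only an illustrative sketch under a simple assumed convention: two sequences that have already been aligned (gaps written as "-") are compared column by column, and identity is the share of columns holding the same residue. The function names and the treatment of gaps are assumptions, not the definition used in this specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Assumed convention (not taken from the source text): the inputs are
    pre-aligned, equal-length strings, '-' marks a gap, and any column
    containing a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must be non-empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 45.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, two ten-residue rows that differ at one position score 90%, which would satisfy the 45% through 85% thresholds listed above but not the 95% or 98% thresholds.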

In a preferred embodiment, a Tango-77 nucleic acid
molecule has the nucleotide sequence shown in SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10 or the
nucleotide sequence of the cDNA of ATCC 98807.

Also within the invention is a nucleic acid
molecule which encodes a fragment of a polypeptide having
the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:12, SEQ ID NO:13, wherein the fragment
includes at least 15 (e.g., 25, 30, 50, 100, 150, or 178)
contiguous amino acids of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:12, SEQ ID NO:13, or the polypeptide
encoded by the cDNA of ATCC Accession Number 98807.

The invention includes a nucleic acid molecule
which encodes a naturally occurring allelic variant of a
polypeptide comprising the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13 or
an amino acid sequence encoded by the cDNA of ATCC
Accession Number 98807, wherein the nucleic acid molecule
hybridizes to a nucleic acid molecule comprising SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, or a
complement thereof under stringent conditions.

Also within the invention are: an isolated
Tango-77 protein having an amino acid sequence that is at
least about 45%, preferably 65%, 75%, 85%, 95%, or 98%
identical to the amino acid sequence of SEQ ID NO:5, SEQ
ID NO:9 or SEQ ID NO:13 (mature human Tango-77), or the
amino acid sequence of SEQ ID NO:2, SEQ ID NO:7 or SEQ ID
NO:11 (immature human Tango-77).

Also within the invention are: an isolated
Tango-77 protein which is encoded by a nucleic acid
molecule having a nucleotide sequence that is at least
about 65%, preferably 75%, 85%, or 95% identical to SEQ
ID NO:3, SEQ ID NO:6, SEQ ID NO:10 or the cDNA of ATCC
98807; and an isolated Tango-77 protein which is encoded
by a nucleic acid molecule having a nucleotide sequence
which hybridizes under stringent hybridization conditions
to a nucleic acid molecule having the nucleotide sequence
of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, the non-coding
strand of the cDNA of ATCC 98807, or the complement
thereof.

Also within the invention is a polypeptide which
is a naturally occurring allelic variant of a polypeptide
that includes the amino acid sequence of SEQ ID NO:2, SEQ
ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID
NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or an
amino acid sequence encoded by the cDNA insert of the
plasmid deposited with ATCC as Accession Number 98807,
wherein the polypeptide is encoded by a nucleic acid
molecule which hybridizes to a nucleic acid molecule
comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID
NO:10 or the complement thereof under stringent
conditions.

Another embodiment of the invention features
Tango-77 nucleic acid molecules which specifically detect
Tango-77 nucleic acid molecules relative to nucleic acid
molecules encoding other members of the cytokine
superfamily. For example, in one embodiment, a Tango-77
nucleic acid molecule hybridizes under stringent
conditions to a nucleic acid molecule comprising the
nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:6, SEQ ID NO:10, the cDNA of ATCC 98807, or a
complement thereof. In another embodiment, the Tango-77
nucleic acid molecule is at least 300 (325, 350, 375,
400, 425, 450, 500, 550, 600, 650, 700, 800, 900, or 989)
nucleotides in length and hybridizes under stringent
conditions to a nucleic acid molecule comprising the
nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:6, SEQ ID NO:10, the cDNA of ATCC 98807, or a
complement thereof. In yet another embodiment, the
invention provides an isolated nucleic acid molecule
which is antisense to the coding strand of a Tango-77
nucleic acid.

Another aspect of the invention provides a vector,
e.g., a recombinant expression vector, comprising a
Tango-77 nucleic acid molecule of the invention. In
another embodiment, the invention provides a host cell
containing such a vector. The invention also provides a
method for producing Tango-77 protein by culturing, in a
suitable medium, a host cell of the invention containing
a recombinant expression vector such that a Tango-77
protein is produced.

Another aspect of this invention features isolated
or recombinant Tango-77 proteins and polypeptides.
Preferred Tango-77 proteins and polypeptides possess at
least one biological activity possessed by naturally
occurring human Tango-77, e.g., (i) the ability to
interact with proteins in the Tango-77 signalling pathway;
(ii) the ability to interact with a Tango-77 ligand or
receptor; (iii) the ability to interact with an
intracellular target protein; (iv) the ability to
interact with a protein involved in inflammation; and (v)
the ability to bind the IL-1 receptor. Other activities
include the induction and suppression of polypeptide
interleukins, cytokines and growth factors.

The Tango-77 proteins of the present invention, or
biologically active portions thereof, can be operably
linked to a non-Tango-77 polypeptide (e.g., heterologous
amino acid sequences) to form Tango-77 fusion proteins.
The invention further features antibodies that
specifically bind Tango-77 proteins, such as monoclonal
or polyclonal antibodies. In addition, the Tango-77
proteins or biologically active portions thereof can be
incorporated into pharmaceutical compositions, which
optionally include pharmaceutically acceptable carriers.

In another aspect, the present invention provides
a method for detecting the presence of Tango-77 activity
or expression in a biological sample by contacting the
biological sample with an agent capable of detecting an
indicator of Tango-77 activity or expression such that
the presence of Tango-77 activity or expression is
detected in the biological sample.

In another aspect, the invention provides a method
for modulating Tango-77 activity comprising contacting a
cell with an agent that modulates (inhibits or
stimulates) Tango-77 activity or expression such that
Tango-77 activity or expression in the cell is modulated.
In one embodiment, the agent is an antibody that specifically
binds to Tango-77 protein. In another embodiment, the
agent modulates expression of Tango-77 by modulating
transcription of a Tango-77 gene, splicing of a Tango-77
mRNA, or translation of a Tango-77 mRNA. In yet another
embodiment, the agent is a nucleic acid molecule having a
nucleotide sequence that is antisense to the coding
strand of the Tango-77 mRNA or the Tango-77 gene.

In one embodiment, the methods of the present 
invention are used to treat a subject having a disorder 
characterized by aberrant Tango-77 protein activity or 
nucleic acid expression by administering an agent which 
is a Tango-77 modulator to the subject. In one 
embodiment, the Tango-77 modulator is a Tango-77 protein. 
In another embodiment, the Tango-77 modulator is a 
Tango-77 nucleic acid molecule. In other embodiments, 
the Tango-77 modulator is a peptide, peptidomimetic, or 
other small molecule. In a preferred embodiment, the 
disorder characterized by aberrant Tango-77 protein or 
nucleic acid expression can include chronic and acute 
inflammation. 

The present invention also provides a diagnostic
assay for identifying the presence or absence of a
genetic lesion or mutation characterized by at least one
of: (i) aberrant modification or mutation of a gene
encoding a Tango-77 protein; (ii) mis-regulation of a
gene encoding a Tango-77 protein; and (iii) aberrant
post-translational modification of a Tango-77 protein,
wherein a wild-type form of the gene encodes a protein
with a Tango-77 activity.

In another aspect, the invention provides a
method for identifying a compound that binds to or
modulates the activity of a Tango-77 protein. In
general, such methods entail measuring a biological
activity of a Tango-77 protein in the presence and
absence of a test compound and identifying those
compounds which alter the activity of the Tango-77
protein.

The invention also features methods for
identifying a compound which modulates the expression of
Tango-77 by measuring the expression of Tango-77 in the
presence and absence of a compound.

Other features and advantages of the invention
will be apparent from the following detailed description
and claims.

Brief Description of the Drawings
Figure 1 depicts the cDNA sequence (SEQ ID NO:1)
of Tango-77. The Tango-77 cDNA has three possible open
reading frames which encode the amino acid sequence (SEQ
ID NO:2, SEQ ID NO:7 and SEQ ID NO:11) of human Tango-77.
The three potential open reading frames of SEQ ID NO:1
extend from: (1) nucleotide 356 to nucleotide 889 (SEQ ID
NO:3); (2) nucleotide 389 to nucleotide 889 (SEQ ID
NO:6); and (3) nucleotide 481 to nucleotide 889 (SEQ ID
NO:10).

Figure 2 depicts an alignment of an amino acid
sequence of Tango-77 (T77; SEQ ID NO:2) with IL-1RA (SEQ
ID NO:14), and IL-1β (SEQ ID NO:15).

Figure 3 depicts the genomic sequence of BAC1 (SEQ
ID NO:16).

Figure 4 depicts the genomic sequence of BAC2 (SEQ
ID NO:17).

Figure 5 depicts an amino acid sequence of an
alternatively spliced form of Tango-77 (SEQ ID NO:2) as
predicted by Procrustes (T77-procrustes; SEQ ID NO:18).

Figure 6 depicts an alignment of an amino acid
sequence of an alternatively spliced form of Tango-77
(T77-procrustes; SEQ ID NO:18) with Tango-77 (SEQ ID
NO:2).



Figure 7 depicts an alignment of an amino acid
sequence of an alternatively spliced form of Tango-77
(T77-procrustes; SEQ ID NO:18) with IL-1ra (SEQ ID
NO:14), and IL-1β (SEQ ID NO:15).

Detailed Description of the Invention

The present invention is based on the discovery of
a cDNA molecule encoding human Tango-77, a member of the
cytokine superfamily. The cDNA molecule encoding human
Tango-77 has three possible open reading frames. The
three possible nucleotide open reading frames for human
Tango-77 protein are shown in Figure 1 (SEQ ID NO:3, SEQ
ID NO:6 and SEQ ID NO:10). The predicted amino acid
sequences for the three possible Tango-77 immature
proteins are also shown in
Figure 1 (SEQ ID NO:2, SEQ ID NO:7 or SEQ ID NO:11) and
three possible mature proteins are also shown in Figure 1
(SEQ ID NO:5, SEQ ID NO:9 and SEQ ID NO:13).

The Tango-77 cDNA of Figure 1 (SEQ ID NO:1), which
is approximately 989 nucleotides long including
untranslated regions, encodes a protein having
a molecular weight of approximately 19 kDa, 18 kDa, or
14.9 kDa (excluding post-translational modifications) and
the possible mature form of the protein has a molecular
weight of 13 kDa. A plasmid containing a cDNA encoding
human Tango-77 (with the cDNA insert name of Of fthx077)
was deposited with American Type Culture Collection
(ATCC), 10801 University Boulevard, Manassas, Virginia
20110-2209 on July 2, 1998 and assigned Accession Number
98807. This deposit will be maintained under the terms
of the Budapest Treaty on the International Recognition
of the Deposit of Microorganisms for the Purposes of
Patent Procedure. This deposit was made merely as a
convenience for those of skill in the art and is not an
admission that a deposit is required under 35 U.S.C.
§112.
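
The molecular weights quoted above are roughly what one would expect from the stated protein lengths. As a back-of-the-envelope sketch (not the method used in this specification, which does not say how the values were obtained), the snippet below multiplies each length by an assumed average residue mass of about 110 Da, a common rule of thumb. The constant and the function name are assumptions introduced for illustration.

```python
# Assumption: ~110 Da per amino acid residue, a widely used rough average.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass in kDa from residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # Lengths are those stated in the text for the three immature forms
    # and for the predicted mature form.
    for label, length in [
        ("first ORF, 178 aa", 178),
        ("second ORF, 167 aa", 167),
        ("third ORF, 136 aa", 136),
        ("mature form, 115 aa", 115),
    ]:
        print(f"{label}: ~{approx_mass_kda(length):.1f} kDa")
```

Each estimate falls within about 1 kDa of the corresponding figure in the text; an exact value would require the actual residue composition from the sequence listing.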

Human Tango-77 is one member of a family of
molecules (the "Tango-77 family") having certain
conserved structural and functional features. The term
"family," when referring to the protein and nucleic acid
molecules of the invention, is intended to mean two or
more proteins or nucleic acid molecules having a common
structural domain and having sufficient amino acid or
nucleotide sequence identity as defined herein. Such
family members can be naturally occurring and can be from
either the same or different species. For example, a
family can contain a first protein of human origin and a
homologue of that protein of murine origin, as well as a
second, distinct protein of human origin and a murine
homologue of that protein. Members of a family may also
have common functional characteristics.

As used interchangeably herein, a "Tango-77
activity", "biological activity of Tango-77" or
"functional activity of Tango-77" refers to an activity
exerted by a Tango-77 protein, polypeptide or nucleic
acid molecule on a Tango-77 responsive cell as determined
in vivo, or in vitro, according to standard techniques.
A Tango-77 activity can be a direct activity, such as an
association with a second protein, or an indirect
activity, such as a cellular signaling activity mediated
by interaction of the Tango-77 protein with a second
protein. In a preferred embodiment, a Tango-77 activity
includes at least one or more of the following
activities: (i) the ability to interact with proteins in
the Tango-77 signalling pathway; (ii) the ability to
interact with a Tango-77 ligand or receptor; (iii) the
ability to interact with an intracellular target protein;
(iv) the ability to interact with a protein involved in
inflammation; and (v) the ability to bind the IL-1
receptor.

Accordingly, another embodiment of the invention
features isolated Tango-77 proteins and polypeptides
having a Tango-77 activity.

Yet another embodiment of the invention features
Tango-77 molecules which contain a signal sequence.
Generally, a signal sequence (or signal peptide) is a
peptide containing about 21 to 63 amino acids which
occurs at the extreme N-terminal end of a secretory
protein. The native Tango-77 signal sequence (SEQ ID
NO:4, SEQ ID NO:8, or SEQ ID NO:12) can be removed and
replaced with a signal sequence from another protein. In
certain host cells (e.g., mammalian host cells),
expression and/or secretion of Tango-77 can be increased
through use of a heterologous signal sequence. For
example, the gp67 secretory sequence of the baculovirus
envelope protein can be used as a heterologous signal
sequence. Alternatively, the native Tango-77 signal
sequence can itself be used as a heterologous signal
sequence in expression systems, e.g., to facilitate the
secretion of a protein of interest.

Various aspects of the invention are described in 
further detail in the following subsections. 

I. Isolated Nucleic Acid Molecules

One aspect of the invention pertains to isolated
nucleic acid molecules that encode Tango-77 proteins or
biologically active portions thereof, as well as nucleic
acid molecules sufficient for use as hybridization probes
to identify Tango-77-encoding nucleic acids (e.g.,
Tango-77 mRNA) and fragments for use as PCR primers for
the amplification or mutation of Tango-77 nucleic acid
molecules. As used herein, the term "nucleic acid
molecule" is intended to include DNA molecules (e.g.,
cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and
analogs of the DNA or RNA generated using nucleotide
analogs. The nucleic acid molecule can be
single-stranded or double-stranded, but preferably is
double-stranded DNA.

An "isolated" nucleic acid molecule is one which
is separated from other nucleic acid molecules which are
present in the natural source of the nucleic acid.
Preferably, an "isolated" nucleic acid is free of
sequences (preferably protein encoding sequences) which
naturally flank the nucleic acid (i.e., sequences located
at the 5' and 3' ends of the nucleic acid) in the genomic
DNA of the organism from which the nucleic acid is
derived. For example, in various embodiments, the
isolated Tango-77 nucleic acid molecule can contain less
than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb
of nucleotide sequences which naturally flank the nucleic
acid molecule in genomic DNA of the cell from which the
nucleic acid is derived. Moreover, an "isolated" nucleic
acid molecule, such as a cDNA molecule, can be
substantially free of other cellular material, or culture
medium when produced by recombinant techniques, or
substantially free of chemical precursors or other
chemicals when chemically synthesized.

A nucleic acid molecule of the present invention,
e.g., a nucleic acid molecule having the nucleotide
sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID
NO:10, the cDNA of ATCC 98807, or a complement of any of
these nucleotide sequences, can be isolated using
standard molecular biology techniques and the sequence
information provided herein. Using all or a portion of
the nucleic acid sequences of SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:6, SEQ ID NO:10, the cDNA of ATCC 98807, or the
complement thereof as a hybridization probe, Tango-77
nucleic acid molecules can be isolated using standard
hybridization and cloning techniques (e.g., as described
in Sambrook et al., eds., Molecular Cloning: A
Laboratory Manual, 2nd ed., Cold Spring Harbor
Laboratory, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, 1989).

A nucleic acid of the invention can be amplified
using cDNA, mRNA or genomic DNA as a template and
appropriate oligonucleotide primers according to standard
PCR amplification techniques. The nucleic acid so
amplified can be cloned into an appropriate vector and
characterized by DNA sequence analysis. Furthermore,
oligonucleotides corresponding to Tango-77 nucleotide
sequences can be prepared by standard synthetic
techniques, e.g., using an automated DNA synthesizer.

In another preferred embodiment, an isolated
nucleic acid molecule of the invention comprises a
nucleic acid molecule which is a complement of the
nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:6, SEQ ID NO:10, the cDNA of ATCC 98807, or a
portion thereof. A nucleic acid molecule which is
complementary to a given nucleotide sequence is one which
is sufficiently complementary to the given nucleotide
sequence that it can hybridize to the given nucleotide
sequence thereby forming a stable duplex.
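
For perfectly base-paired DNA, the complement described above can be computed directly. The following is a minimal sketch, assuming plain A/C/G/T strings written 5' to 3'. The function names are illustrative, and the exact-match test is a deliberate simplification, since the text only requires complementarity sufficient to form a stable duplex.

```python
# Watson-Crick pairing for DNA, preserving letter case.
_PAIR = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_PAIR)[::-1]


def is_exact_complement(strand_a: str, strand_b: str) -> bool:
    """True if strand_b pairs with strand_a at every position.

    This is stricter than the definition in the text, which tolerates
    mismatches as long as a stable duplex can still form.
    """
    return reverse_complement(strand_a).upper() == strand_b.upper()
```

For instance, `reverse_complement("ATGC")` yields `"GCAT"`, and applying the function twice returns the original strand.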

Moreover, the nucleic acid molecule of the
invention can comprise only a portion of a nucleic acid
sequence encoding Tango-77, for example, a fragment which
can be used as a probe or primer or a fragment encoding a
biologically active portion of Tango-77. The nucleotide
sequence determined from the cloning of the human
Tango-77 gene allows for the generation of probes and
primers designed for use in identifying and/or cloning
Tango-77 homologues in other cell types, e.g., from other
tissues, as well as Tango-77 homologues from other
mammals. The probe/primer typically comprises
substantially purified oligonucleotide. The
oligonucleotide typically comprises a region of
nucleotide sequence that hybridizes under stringent
conditions to at least about 12, preferably about 25,
more preferably about 50, 75, 100, 125, 150, 175, 200,
250, 300, 350 or 400 consecutive nucleotides of the sense
or anti-sense sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ
ID NO:6, SEQ ID NO:10, or the cDNA of ATCC 98807.
Alternatively, the oligonucleotide can typically comprise
a region of nucleotide sequence that hybridizes under
stringent conditions to at least about 12, preferably
about 25, more preferably about 50, 75, 100, 125, 150,
175, 200, 250, 300, 350 or 400 consecutive nucleotides of
the sense or anti-sense sequence of a naturally occurring
mutant of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID
NO:10, or the cDNA of ATCC 98807.

Probes based on the human Tango-77 nucleotide
sequence can be used to detect transcripts or genomic
sequences encoding the same or identical proteins. The
probe comprises a label group attached thereto, e.g., a
radioisotope, a fluorescent compound, an enzyme, or an
enzyme co-factor. Such probes can be used as a part of a
diagnostic test kit for identifying cells or tissues
which mis-express a Tango-77 protein, such as by
measuring a level of a Tango-77-encoding nucleic acid in
a sample of cells from a subject, e.g., detecting
Tango-77 mRNA levels or determining whether a genomic
Tango-77 gene has been mutated or deleted.

A nucleic acid fragment encoding a "biologically
active portion of Tango-77" can be prepared by isolating
a portion of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ
ID NO:10 or the nucleotide sequence of the cDNA of ATCC
98807 which encodes a polypeptide having a Tango-77
biological activity, expressing the encoded portion of
Tango-77 protein (e.g., by recombinant expression in
vitro) and assessing the activity of the encoded portion
of Tango-77.

The invention further encompasses nucleic acid
molecules that differ from the nucleotide sequence of SEQ
ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, or the
cDNA of ATCC 98807 due to degeneracy of the genetic code
and thus encode the same Tango-77 protein as that encoded
by the nucleotide sequence shown in SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:6, SEQ ID NO:10, or the cDNA of ATCC
98807.
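
Degeneracy of the genetic code means that several codons specify the same amino acid, so distinct nucleotide sequences can encode an identical protein. The sketch below illustrates this with the standard genetic code. The two nine-nucleotide sequences are invented toy examples, not fragments of any SEQ ID NO, and the helper names are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order;
# '*' marks a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string, read in frame from position 0."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_protein(dna_a: str, dna_b: str) -> bool:
    """True if two coding sequences translate to the same amino acids."""
    return translate(dna_a) == translate(dna_b)


# Hypothetical toy sequences differing at three nucleotide positions.
VARIANT_A = "ATGGCTCTG"
VARIANT_B = "ATGGCCTTA"
```

Both toy sequences translate to Met-Ala-Leu, so `encode_same_protein(VARIANT_A, VARIANT_B)` is true even though the nucleotide strings differ.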

In addition to the human Tango-77 nucleotide
sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6,
SEQ ID NO:10, or the cDNA of ATCC 98807, it will be
appreciated by those skilled in the art that DNA sequence 
polymorphisms that lead to changes in the amino acid 
sequences of Tango-77 may exist within a population 
(e.g., the human population). Such genetic polymorphism 
in the Tango-77 gene may exist among individuals within a 
population due to natural allelic variation. An allele 
is one of a group of genes which occur alternatively at a 
given genetic locus. As used herein, the term "allelic 
variant" refers to a nucleotide sequence which occurs at 
a Tango-77 locus or to a polypeptide encoded by the 
nucleotide sequence. As used herein, the terms "gene" 
and "recombinant gene" refer to nucleic acid molecules 
comprising an open reading frame encoding a Tango-77 
protein, preferably a mammalian Tango-77 protein. Such 
natural allelic variations can typically result in 1-5% 
variance in the nucleotide sequence of the Tango-77 gene. 
Alternative alleles can be identified by sequencing the 
gene of interest in a number of different individuals. 
This can be readily carried out by using hybridization 
probes to identify the same genetic locus in a variety of 
individuals. Any and all such nucleotide variations and 
resulting amino acid polymorphisms or variations in 



WO 99/06426 



PCT/US98/16102 



Tango-77 that are the result of natural allelic variation
and that do not alter the functional activity of Tango-77
are intended to be within the scope of the invention.

Moreover, nucleic acid molecules encoding Tango-77
proteins from other species (Tango-77 homologues), which
have a nucleotide sequence which differs from that of a
human Tango-77, are intended to be within the scope of
the invention. Nucleic acid molecules corresponding to
natural allelic variants and homologues of the Tango-77
cDNA of the invention can be isolated based on their
identity to the human Tango-77 nucleic acids disclosed
herein using the human cDNAs, or a portion thereof, as a
hybridization probe according to standard hybridization
techniques under stringent hybridization conditions.

Accordingly, in another embodiment, an isolated
nucleic acid molecule of the invention is at least 300
(325, 350, 375, 400, 425, 450, 500, 550, 600, 650, 700,
800, or 989) nucleotides in length and hybridizes under
stringent conditions to the nucleic acid molecule
comprising the nucleotide sequence, preferably the coding
sequence, of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ
ID NO:10, or the cDNA of ATCC 98807.

As used herein, the term "hybridizes under
stringent conditions" is intended to describe conditions
for hybridization and washing under which nucleotide
sequences at least 60% (65%, 70%, preferably 75%)
identical to each other typically remain hybridized to
each other. Such stringent conditions are known to those
skilled in the art and can be found in Current Protocols
in Molecular Biology, John Wiley & Sons, N.Y. (1989),
6.3.1-6.3.6. A preferred, non-limiting example of
stringent hybridization conditions is hybridization in
6X sodium chloride/sodium citrate (SSC) at about 45°C,
followed by one or more washes in 0.2X SSC, 0.1% SDS at
50-65°C. Preferably, an isolated nucleic acid molecule
of the invention that hybridizes under stringent
conditions to the sequence of SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:6, SEQ ID NO:10, the cDNA of ATCC 98807, or the
complement thereof, corresponds to a naturally-occurring
nucleic acid molecule. As used herein, a
"naturally-occurring" nucleic acid molecule refers to an RNA or DNA
molecule having a nucleotide sequence that occurs in
nature (e.g., encodes a natural protein).

In addition to naturally-occurring allelic
variants of the Tango-77 sequence that may exist in the
population, the skilled artisan will further appreciate
that changes can be introduced by mutation into the
nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:6, SEQ ID NO:10 or the cDNA of ATCC 98807, thereby
leading to changes in the amino acid sequence of the
encoded Tango-77 protein, without altering the biological
activity of the Tango-77 protein. Amino acid residues
that are not conserved or only semiconserved among
Tango-77 of various species may be non-essential for
activity and thus would likely be targets for alteration.
Alternatively, one can make nucleotide substitutions
leading to amino acid substitutions at "non-essential"
amino acid residues. A "non-essential" amino acid
residue is a residue that can be altered from the wild-
type sequence of Tango-77 (e.g., the sequence of SEQ ID
NO:2, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11
or SEQ ID NO:13) without altering the biological
activity, whereas an "essential" amino acid residue is
required for biological activity. For example, amino
acid residues that are conserved among the Tango-77
proteins of various species may be essential for activity
and thus would not likely be targets for alteration,
unless one wishes to reduce or alter Tango-77 activity.

Accordingly, another aspect of the invention
pertains to nucleic acid molecules encoding Tango-77
proteins that contain changes in amino acid residues that
are not essential for activity. Such Tango-77 proteins
differ in amino acid sequence from SEQ ID NO:2, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID
NO:13 yet retain biological activity. In one embodiment,
the isolated nucleic acid molecule includes a nucleotide
sequence encoding a protein that includes an amino acid
sequence that is at least about 45%, 65%, 75%, 85%, 95%,
or 98% identical to the amino acid sequence of
SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ
ID NO:11, or SEQ ID NO:13.

An isolated nucleic acid molecule encoding a
Tango-77 protein having a sequence which differs from
that of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, SEQ ID NO:11, or SEQ ID NO:13 can be created by
introducing one or more nucleotide substitutions,
additions or deletions into the nucleotide sequence of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, or
the cDNA of ATCC 98807 such that one or more amino acid
substitutions, additions or deletions are introduced into
the encoded protein. Mutations can be introduced by
standard techniques, such as site-directed mutagenesis
and PCR-mediated mutagenesis. Preferably, conservative
amino acid substitutions are made at one or more
predicted non-essential amino acid residues. A
"conservative amino acid substitution" is one in which
the amino acid residue is replaced with an amino acid
residue having a similar side chain. Families of amino
acid residues having similar side chains have been
defined in the art. These families include amino acids
with basic side chains (e.g., lysine, arginine,
histidine), acidic side chains (e.g., aspartic acid,
glutamic acid), uncharged polar side chains (e.g.,
glycine, asparagine, glutamine, serine, threonine,
tyrosine, cysteine), nonpolar side chains (e.g., alanine,
valine, leucine, isoleucine, proline, phenylalanine,
methionine, tryptophan), beta-branched side chains (e.g.,
threonine, valine, isoleucine) and aromatic side chains
(e.g., tyrosine, phenylalanine, tryptophan, histidine).
Thus, a predicted nonessential amino acid residue in
Tango-77 is preferably replaced with another amino acid
residue from the same side chain family. Alternatively,
mutations can be introduced randomly along all or part of
a Tango-77 coding sequence, such as by saturation
mutagenesis, and the resultant mutants can be screened
for Tango-77 biological activity to identify mutants that
retain activity. Following mutagenesis, the encoded
protein can be expressed recombinantly and the activity
of the protein can be determined.
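
By way of non-limiting illustration only, the side chain
families listed above can be encoded as a lookup table for
checking whether a proposed substitution is conservative.
The following Python sketch is an assumption added for
clarity and is not part of the disclosed methods; the names
SIDE_CHAIN_FAMILIES and is_conservative are hypothetical,
and residues are given as standard one-letter codes.

```python
# Side chain families as listed in the text (one-letter amino acid codes).
# The dictionary and function names are hypothetical, for illustration only.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),            # lysine, arginine, histidine
    "acidic": set("DE"),            # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),    # threonine, valine, isoleucine
    "aromatic": set("YFWH"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the two residues differ and share at least one family."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in family and replacement in family
        for family in SIDE_CHAIN_FAMILIES.values()
    )


if __name__ == "__main__":
    print(is_conservative("K", "R"))  # both basic
    print(is_conservative("K", "D"))  # basic versus acidic
```

Because a residue can belong to more than one family (e.g.,
histidine is listed as both basic and aromatic), the sketch
treats a substitution as conservative if any one family
contains both residues.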

In a preferred embodiment, a mutant Tango-77
protein can be assayed for: (1) the ability to form
protein:protein interactions with proteins in the
Tango-77 signalling pathway; (2) the ability to bind a
Tango-77 ligand or receptor; (3) the ability to bind
to an intracellular target protein; (4) the ability to
interact with a protein involved in inflammation; or (5)
the ability to bind the IL-1 receptor. In yet another
preferred embodiment, a mutant Tango-77 can be assayed
for the ability to modulate inflammation, asthma,
autoimmune diseases, and sepsis.

The present invention encompasses antisense
nucleic acid molecules, i.e., molecules which are
complementary to a sense nucleic acid encoding a protein,
e.g., complementary to the coding strand of a double-
stranded cDNA molecule or complementary to an mRNA
sequence. Accordingly, an antisense nucleic acid can
hydrogen bond to a sense nucleic acid. The antisense
nucleic acid can be complementary to an entire Tango-77
coding strand, or to only a portion thereof, e.g., all or
part of the protein coding region (or open reading
frame). An antisense nucleic acid molecule can be
antisense to a noncoding region of the coding strand of a
nucleotide sequence encoding Tango-77. The noncoding
regions ("5' and 3' untranslated regions") are the 5' and
3' sequences which flank the coding region and are not
translated into amino acids.

Given the coding strand sequences encoding
Tango-77 disclosed herein (e.g., SEQ ID NO:3, SEQ ID NO:5,
or SEQ ID NO:8), antisense nucleic acids of the invention
can be designed according to the rules of Watson and
Crick base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of
Tango-77 mRNA, but more preferably is an oligonucleotide
which is antisense to only a portion of the coding or
noncoding region of Tango-77 mRNA. For example, the
antisense oligonucleotide can be complementary to the
region surrounding the translation start site of Tango-77
mRNA, e.g., an oligonucleotide having the sequence
5'-TGCAACTTTTACAGGAAACAC-3' (SEQ ID NO:19) or
5'-CCTCACTTTTACCCGAGACTC-3' (SEQ ID NO:20) or
5'-GACGGGTGGTACTTAAAACAA-3' (SEQ ID NO:21). An antisense
oligonucleotide can be, for example, about 5, 10, 15, 20,
25, 30, 35, 40, 45 or 50 nucleotides in length. An
antisense nucleic acid of the invention can be
constructed using chemical synthesis and enzymatic
ligation reactions using procedures known in the art.
For example, an antisense nucleic acid (e.g., an
antisense oligonucleotide) can be chemically synthesized
using naturally occurring nucleotides or variously
modified nucleotides designed to increase the biological
stability of the molecules or to increase the physical
stability of the duplex formed between the antisense and
sense nucleic acids, e.g., phosphorothioate derivatives
and acridine substituted nucleotides can be used.
Examples of modified nucleotides which can be used to
generate the antisense nucleic acid include
5-fluorouracil, 5-bromouracil, 5-chlorouracil,
5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine,
5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueosine, inosine,
N6-isopentenyladenine, 1-methylguanine, 1-methylinosine,
2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil,
5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine,
5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid
(v), wybutoxosine, pseudouracil, queosine,
2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid
methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-
thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil
(acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using
an expression vector into which a nucleic acid has been
subcloned in an antisense orientation (i.e., RNA
transcribed from the inserted nucleic acid will be of an
antisense orientation to a target nucleic acid of
interest, described further in the following subsection).

The antisense nucleic acid molecules of the
invention are typically administered to a subject or
generated in situ such that they hybridize with or bind
to cellular mRNA and/or genomic DNA encoding a Tango-77
protein to thereby inhibit expression of the protein,
e.g., by inhibiting transcription and/or translation.
The hybridization can be by conventional nucleotide
complementarity to form a stable duplex, or, for example,
in the case of an antisense nucleic acid molecule which
binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a
route of administration of antisense nucleic acid
molecules of the invention includes direct injection at a
tissue site. Alternatively, antisense nucleic acid
molecules can be modified to target selected cells and
then administered systemically. For example, for
systemic administration, antisense molecules can be
modified such that they specifically bind to receptors or
antigens expressed on a selected cell surface, e.g., by
linking the antisense nucleic acid molecules to peptides
or antibodies which bind to cell surface receptors or
antigens. The antisense nucleic acid molecules can also
be delivered to cells using the vectors described herein.
To achieve sufficient intracellular concentrations of the
antisense molecules, vector constructs in which the
antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are
preferred.

An antisense nucleic acid molecule of the
invention can be an α-anomeric nucleic acid molecule. An
α-anomeric nucleic acid molecule forms specific double-
stranded hybrids with complementary RNA in which,
contrary to the usual β-units, the strands run parallel
to each other (Gaultier et al. (1987) Nucleic Acids Res.
15:6625-6641). The antisense nucleic acid molecule can
also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res. 15:6131-6148) or a chimeric
RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-
330).

The invention also encompasses ribozymes. 
Ribozymes are catalytic RNA molecules with ribonuclease 
activity which are capable of cleaving a single-stranded 
nucleic acid, such as an mRNA, to which they have a 
complementary region. Thus, ribozymes (e.g., hammerhead
ribozymes (described in Haseloff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave
Tango-77 mRNA transcripts to thereby inhibit translation
of Tango-77 mRNA. A ribozyme having specificity for a
Tango-77-encoding nucleic acid can be designed based upon
the nucleotide sequence of a Tango-77 cDNA disclosed
herein (e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ
ID NO:10). For example, a derivative of a Tetrahymena L-
19 IVS RNA can be constructed in which the nucleotide
sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a Tango-77-encoding
mRNA. See, e.g., Cech et al. U.S. Patent No. 4,987,071;
and Cech et al. U.S. Patent No. 5,116,742.
Alternatively, Tango-77 mRNA can be used to select a 
catalytic RNA having a specific ribonuclease activity 
from a pool of RNA molecules. See, e.g., Bartel and 
Szostak (1993) Science 261:1411-1418. 

The invention also encompasses nucleic acid 
molecules which form triple helical structures. For 
example, Tango-77 gene expression can be inhibited by 
targeting nucleotide sequences complementary to the 
regulatory region of the Tango-77 gene (e.g., the Tango-77
promoter and/or enhancers) to form triple helical
structures that prevent transcription of the Tango-77
gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6(6):569-84; Helene (1992) Ann. N.Y.
Acad. Sci. 660:27-36; and Maher (1992) Bioessays
14(12):807-15.

In preferred embodiments, the nucleic acid 
molecules of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, 
e.g., the stability, hybridization, or solubility of the 
molecule. For example, the deoxyribose phosphate 
backbone of the nucleic acids can be modified to generate 
peptide nucleic acids (see Hyrup et al. (1996) Bioorganic
& Medicinal Chemistry 4(1):5-23). As used herein, the
terms "peptide nucleic acids" or "PNAs" refer to nucleic
acid mimics, e.g., DNA mimics, in which the deoxyribose
phosphate backbone is replaced by a pseudopeptide
backbone and only the four natural nucleobases are
retained. The neutral backbone of PNAs has been shown to
allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA
oligomers can be performed using standard solid phase
peptide synthesis protocols as described in Hyrup et al.
(1996) supra; Perry-O'Keefe et al. (1996) Proc. Natl.
Acad. Sci. USA 93:14670-675.

PNAs of Tango-77 can be used in therapeutic and 
diagnostic applications. For example, PNAs can be used 
as antisense or antigene agents for sequence-specific 
modulation of gene expression by, e.g., inducing 
transcription or translation arrest or inhibiting 
replication. PNAs of Tango-77 can also be used, e.g., in 
the analysis of single base pair mutations in a gene by, 
e.g., PNA directed PCR clamping; as artificial 
restriction enzymes when used in combination with other 
enzymes, e.g., S1 nucleases (Hyrup (1996) supra); or as
probes or primers for DNA sequence and hybridization
(Hyrup (1996) supra; Perry-O'Keefe et al. (1996) Proc.
Natl. Acad. Sci. USA 93:14670-675).

In another embodiment, PNAs of Tango-77 can be
modified, e.g., to enhance their stability or cellular
uptake, by attaching lipophilic or other helper groups to
PNA, by the formation of PNA-DNA chimeras, or by the use
of liposomes or other techniques of drug delivery known
in the art. For example, PNA-DNA chimeras of Tango-77
can be generated which may combine the advantageous
properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNAse H and DNA polymerases,
to interact with the DNA portion while the PNA portion
would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of
appropriate lengths selected in terms of base stacking,
number of bonds between the nucleobases, and orientation
(Hyrup (1996) supra). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) supra and
Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63.
For example, a DNA chain can be synthesized on a solid
support using standard phosphoramidite coupling chemistry
and modified nucleoside analogs. Compounds such as 5'-
(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite
can be used as a link between the PNA and the 5' end of
DNA (Mag et al. (1989) Nucleic Acid Res. 17:5973-88).
PNA monomers are then coupled in a stepwise manner to
produce a chimeric molecule with a 5' PNA segment and a
3' DNA segment (Finn et al. (1996) Nucleic Acids Res.
24(17):3357-63). Alternatively, chimeric molecules can
be synthesized with a 5' DNA segment and a 3' PNA segment
(Peterser et al. (1975) Bioorganic Med. Chem. Lett.
5:1119-11124).

In other embodiments, the oligonucleotide may
include other appended groups such as peptides (e.g., for
targeting host cell receptors in vivo), or agents
facilitating transport across the cell membrane (see,
e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA
86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad.
Sci. USA 84:648-652; PCT Publication No. WO 88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No.
WO 89/10134). In addition, oligonucleotides can be
modified with hybridization-triggered cleavage agents
(see, e.g., Krol et al. (1988) Bio/Techniques 6:958-976)
or intercalating agents (see, e.g., Zon (1988) Pharm.
Res. 5:539-549). To this end, the oligonucleotide may be
conjugated to another molecule, e.g., a peptide,
hybridization-triggered cross-linking agent, transport
agent, hybridization-triggered cleavage agent, etc.

II. Isolated Tango-77 Proteins and Anti-Tango-77
Antibodies 

One aspect of the invention pertains to isolated 
Tango-77 proteins, and biologically active portions
thereof, as well as polypeptide fragments suitable for
use as immunogens to raise anti-Tango-77 antibodies. In
one embodiment, native Tango-77 proteins can be isolated
from cells or tissue sources by an appropriate
purification scheme using standard protein purification
techniques. In another embodiment, Tango-77 proteins are
produced by recombinant DNA techniques. Alternative to 
recombinant expression, a Tango-77 protein or polypeptide 
can be synthesized chemically using standard peptide 
synthesis techniques. 

An "isolated" or "purified" protein or 
biologically active portion thereof is substantially free 
of cellular material or other contaminating proteins from 
the cell or tissue source from which the Tango-77 protein 
is derived, or substantially free of chemical precursors 
or other chemicals when chemically synthesized. The 
language "substantially free of cellular material" 
includes preparations of Tango-77 protein in which the 
protein is separated from cellular components of the 
cells from which it is isolated or recombinantly 
produced. Thus, Tango-77 protein that is substantially 
free of cellular material includes preparations of 
Tango-77 protein having less than about 30%, 20%, 10%, or 
5% (by dry weight) of non-Tango-77 protein (also referred 
to herein as a "contaminating protein"). When the 
Tango-77 protein or biologically active portion thereof 
is recombinantly produced, it is also preferably 
substantially free of culture medium, i.e., culture
medium represents less than about 20%, 10%, or 5% of the
volume of the protein preparation. When Tango-77 protein
is produced by chemical synthesis, it is preferably
substantially free of chemical precursors or other
chemicals, i.e., it is separated from chemical precursors
or other chemicals which are involved in the synthesis of
the protein. Accordingly such preparations of Tango-77
protein have less than about 30%, 20%, 10%, or 5% (by dry
weight) of chemical precursors or non-Tango-77 chemicals.

Biologically active portions of a Tango-77 protein
include peptides comprising amino acid sequences
sufficiently identical to or derived from the amino acid
sequence of the Tango-77 protein (e.g., the amino acid
sequence shown in SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13), which
include fewer amino acids than the full length Tango-77
proteins, and exhibit at least one activity of a Tango-77
protein. Typically, biologically active portions
comprise a domain or motif with at least one activity of
the Tango-77 protein. A biologically active portion of a
Tango-77 protein can be a polypeptide which is, for
example, 10, 25, 50, 100 or more amino acids in length.

Moreover, other biologically active portions, in
which other regions of the protein are deleted, can be
prepared by recombinant techniques and evaluated for one
or more of the functional activities of a native Tango-77
protein.

A preferred Tango-77 protein has the amino acid
sequence of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13. Other useful
Tango-77 proteins are substantially identical to SEQ ID
NO:2, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, or SEQ ID NO:13 and retain the functional activity
of the protein of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 yet differ in
amino acid sequence due to natural allelic variation or
mutagenesis. Accordingly, a useful Tango-77 protein is a
protein which includes an amino acid sequence at least
about 45%, preferably 55%, 65%, 75%, 85%, 95%, or 99%
identical to the amino acid sequence of SEQ ID NO:2, SEQ
ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ
ID NO:13 and retains the functional activity of the
Tango-77 proteins of SEQ ID NO:2, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13. In a
preferred embodiment, the Tango-77 protein retains a
functional activity of the Tango-77 protein of SEQ ID
NO:2, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, or SEQ ID NO:13.

To determine the percent identity of two amino
acid sequences or of two nucleic acids, the sequences are
aligned for optimal comparison purposes (e.g., gaps can
be introduced in the sequence of a first amino acid or
nucleic acid sequence for optimal alignment with a second
amino or nucleic acid sequence). The amino acid residues
or nucleotides at corresponding amino acid positions or
nucleotide positions are then compared. When a position
in the first sequence is occupied by the same amino acid
residue or nucleotide as the corresponding position in
the second sequence, then the molecules are identical at
that position. The percent identity between the two
sequences is a function of the number of identical
positions shared by the sequences (i.e., % identity = # 
of identical positions/total # of positions (e.g.,
overlapping positions) x 100). Preferably, the two
sequences are the same length.
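
By way of non-limiting illustration only, the percent
identity calculation described above can be sketched in
Python as follows. The sketch assumes that the two sequences
have already been aligned to equal length, that gaps are
written as "-", and that the total number of positions is
taken to be the length of the alignment; these conventions
and the function name percent_identity are assumptions made
for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    % identity = (# of identical positions / total # of positions) x 100.
    Gap characters ("-") never count as identical positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical four-position alignments, for illustration only.
    print(percent_identity("ACGT", "ACGA"))
    print(percent_identity("AC-T", "ACGT"))
```

Only exact matches are counted, so a gapped position lowers
the percent identity in the same way as a mismatch.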

The determination of percent homology between two
sequences can be accomplished using a mathematical
algorithm. A preferred, non-limiting example of a
mathematical algorithm utilized for the comparison of two
sequences is the algorithm of Karlin and Altschul (1990)
Proc. Natl. Acad. Sci. USA 87:2264-2268, modified as in
Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA
90:5873-5877. Such an algorithm is incorporated into the
NBLAST and XBLAST programs of Altschul, et al. (1990)
J. Mol. Biol. 215:403-410. BLAST nucleotide searches can
be performed with the NBLAST program, score = 100,
wordlength = 12 to obtain nucleotide sequences homologous
to Tango-77 nucleic acid molecules of the invention.
BLAST protein searches can be performed with the XBLAST
program, score = 50, wordlength = 3 to obtain amino acid
sequences homologous to Tango-77 protein molecules of the
invention. To obtain gapped alignments for comparison
purposes, Gapped BLAST can be utilized as described in
Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402.
When utilizing BLAST and Gapped BLAST programs, the
default parameters of the respective programs (e.g.,
XBLAST and NBLAST) can be used. See
http://www.ncbi.nlm.nih.gov. Another preferred, non-
limiting example of a mathematical algorithm utilized for
the comparison of sequences is the algorithm of Myers and
Miller, CABIOS (1989). Such an algorithm is incorporated
into the ALIGN program (version 2.0) which is part of the
GCG sequence alignment software package. When utilizing
the ALIGN program for comparing amino acid sequences, a
PAM120 weight residue table, a gap length penalty of 12,
and a gap penalty of 4 can be used.

The percent identity between two sequences can be
determined using techniques similar to those described
above, with or without allowing gaps. In calculating
percent identity, only exact matches are counted.

The invention also provides Tango-77 chimeric or
fusion proteins. As used herein, a Tango-77 "chimeric
protein" or "fusion protein" comprises a Tango-77
polypeptide operably linked to a non-Tango-77
polypeptide. A "Tango-77 polypeptide" refers to a
polypeptide having an amino acid sequence corresponding
to Tango-77 polypeptides, whereas a "non-Tango-77
polypeptide" refers to a polypeptide having an amino acid
sequence corresponding to a protein which is not
substantially identical to the Tango-77 protein, e.g., a
protein which is different from the Tango-77 protein and
which is derived from the same or a different organism.
Within a Tango-77 fusion protein the Tango-77 polypeptide
can correspond to all or a portion of a Tango-77 protein,
preferably at least one biologically active portion of a
Tango-77 protein. Within the fusion protein, the term
"operably linked" is intended to indicate that the
Tango-77 polypeptide and the non-Tango-77 polypeptide are
fused in-frame to each other. The non-Tango-77
polypeptide can be fused to the N-terminus or C-terminus
of the Tango-77 polypeptide.

One useful fusion protein is a GST-Tango-77 fusion 
protein in which the Tango-77 sequences are fused to the 
C-terminus of the GST sequences. Such fusion proteins
can facilitate the purification of recombinant Tango-77.

In another embodiment, the fusion protein is a
Tango-77 protein containing a heterologous signal
sequence at its N-terminus. For example, the native
Tango-77 signal sequence (i.e., about amino acids 1 to 63
of SEQ ID NO:2; SEQ ID NO:4; or about amino acids 1 to 52
of SEQ ID NO:7; SEQ ID NO:8; or about amino acids 1 to 21
of SEQ ID NO:11; SEQ ID NO:12) can be removed and replaced
with a signal sequence from another protein. In certain
host cells (e.g., mammalian host cells), expression
and/or secretion of Tango-77 can be increased through use
of a heterologous signal sequence. For example, the gp67
secretory sequence of the baculovirus envelope protein
can be used as a heterologous signal sequence (Ausubel et
al., supra). Other examples of eukaryotic heterologous
signal sequences include the secretory sequences of
melittin and human placental alkaline phosphatase
(Stratagene; La Jolla, California). In yet another
example, useful prokaryotic heterologous signal sequences
include the phoA secretory signal (Sambrook et al.,
supra) and the protein A secretory signal (Pharmacia
Biotech; Piscataway, New Jersey).

In yet another embodiment, the fusion protein is
a Tango-77-immunoglobulin fusion protein in which all or
part of Tango-77 is fused to sequences derived from a
member of the immunoglobulin protein family. The
Tango-77-immunoglobulin fusion proteins of the invention
can be incorporated into pharmaceutical compositions and
administered to a subject to inhibit an interaction
between a Tango-77 ligand and a Tango-77 receptor on the
surface of a cell, to thereby suppress Tango-77-mediated
signal transduction in vivo. The Tango-77-immunoglobulin
fusion proteins can be used to affect the bioavailability
of a Tango-77 cognate ligand. Inhibition of the Tango-77
ligand/Tango-77 interaction may be useful therapeutically
for the treatment of both inflammatory and autoimmune
disorders. Moreover, the Tango-77-immunoglobulin fusion
proteins of the invention can be used as immunogens to
produce anti-Tango-77 antibodies in a subject, to purify
Tango-77 ligands and in screening assays to identify
molecules which inhibit the interaction of Tango-77 with
a Tango-77 receptor.

Preferably, a Tango-77 chimeric or fusion protein
of the invention is produced by standard recombinant DNA
techniques. For example, DNA fragments coding for the
different polypeptide sequences are ligated together in-
frame in accordance with conventional techniques, for
example by employing blunt-ended or stagger-ended termini
for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as
appropriate, alkaline phosphatase treatment to avoid
undesirable joining, and enzymatic ligation. In another
embodiment, the fusion gene can be synthesized by
conventional techniques including automated DNA
synthesizers. Alternatively, PCR amplification of gene
fragments can be carried out using anchor primers which
give rise to complementary overhangs between two
consecutive gene fragments which can subsequently be
annealed and reamplified to generate a chimeric gene
sequence (see, e.g., Current Protocols in Molecular
Biology, Ausubel et al. eds., John Wiley & Sons: 1992).
Moreover, many expression vectors are commercially
available that already encode a fusion moiety (e.g., a
GST polypeptide). A Tango-77-encoding nucleic acid can
be cloned into such an expression vector such that the
fusion moiety is linked in-frame to the Tango-77 protein.

The present invention also pertains to variants of 
the Tango-77 proteins (i.e., proteins having a sequence 
which differs from that of the Tango-77 amino acid 
sequence). Such variants can function as either Tango-77
agonists (mimetics) or as Tango-77 antagonists. Variants 
of the Tango-77 protein can be generated by mutagenesis, 
e.g., discrete point mutation or truncation of the 
Tango-77 protein. An agonist of the Tango-77 protein can 
retain substantially the same, or a subset, of the 
biological activities of the naturally occurring form of 
the Tango-77 protein. An antagonist of the Tango-77 
protein can inhibit one or more of the activities of the 
naturally occurring form of the Tango-77 protein by, for 
example, competitively binding to a downstream or
upstream member of a cellular signaling cascade which
includes the Tango-77 protein. Thus, specific biological
effects can be elicited by treatment with a variant of
limited function. Treatment of a subject with a variant
having a subset of the biological activities of the
naturally occurring form of the protein can have fewer
side effects in a subject relative to treatment with the
naturally occurring form of the Tango-77 proteins.

Variants of the Tango-77 protein which function a; 
either Tango-77 agonists (mimetics) or as Tango-77 
5 antagonists can be identified by screening combinatorial 
libraries of mutants, e.g., truncation mutants, of the 
Tango-77 protein for Tango-77 protein agonist or 
antagonist activity. In one embodiment, a variegated 
library of Tango-77 variants is generated by 

10 combinatorial mutagenesis at the nucleic acid level and 
is encoded by a variegated gene library. A variegated 
library of Tango-77 variants can be produced by, for 
example, enzymatically ligating a mixture of synthetic 
oligonucleotides into gene sequences such that a
degenerate set of potential Tango-77 sequences is
expressible as individual polypeptides, or alternatively,
as a set of larger fusion proteins (e.g., for phage 
display) containing the set of Tango-77 sequences 
therein. There are a variety of methods which can be
used to produce libraries of potential Tango-77 variants
from a degenerate oligonucleotide sequence. Chemical 
synthesis of a degenerate gene sequence can be performed 
in an automatic DNA synthesizer, and the synthetic gene 
then ligated into an appropriate expression vector. Use
of a degenerate set of genes allows for the provision, in
one mixture, of all of the sequences encoding the desired 
set of potential Tango-77 sequences. Methods for 
synthesizing degenerate oligonucleotides are known in the 
art (see, e.g., Narang (1983) Tetrahedron 39:3; Itakura
et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al.
(1984) Science 198:1056; Ike et al. (1983) Nucleic Acid
Res. 11:477).
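
As a non-limiting illustration of how a single degenerate
oligonucleotide provides, in one mixture, every member of a set of
sequences, the following Python sketch expands a degenerate sequence
written in the standard IUPAC ambiguity codes. The example
oligonucleotide (one NNK codon between two fixed codons) and the
function names are hypothetical and are chosen only to show the
arithmetic.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """Return every concrete sequence encoded by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


def library_size(oligo):
    """Count the encoded sequences without enumerating them."""
    size = 1
    for base in oligo.upper():
        size *= len(IUPAC[base])
    return size


# Hypothetical degenerate oligonucleotide: fixed codon, NNK codon, fixed codon.
example = "GCTNNKGGT"
variants = expand_degenerate(example)  # 4 * 4 * 2 = 32 distinct sequences
```

Because the library size is the product of the number of bases allowed
at each position, it grows rapidly with the number of degenerate
positions; library_size permits the size to be estimated before any
sequences are enumerated.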

In addition, libraries of fragments of the 
Tango-77 protein coding sequence can be used to generate
a variegated population of Tango-77 fragments for
screening and subsequent selection of variants of a
Tango-77 protein. In one embodiment, a library of coding
sequence fragments can be generated by treating a double 
stranded PCR fragment of a Tango-77 coding sequence with
a nuclease under conditions wherein nicking occurs only 
about once per molecule, denaturing the double stranded 
DNA, renaturing the DNA to form double stranded DNA which 
can include sense/antisense pairs from different nicked 
products, removing single stranded portions from reformed 
duplexes by treatment with S1 nuclease, and ligating the
resulting fragment library into an expression vector. By 
this method, an expression library can be derived which 
encodes N-terminal and internal fragments of various
sizes of the Tango-77 protein. 

Several techniques are known in the art for 
screening gene products of combinatorial libraries made 
by point mutations or truncation, and for screening cDNA 
libraries for gene products having a selected property. 
Such techniques are adaptable for rapid screening of the 
gene libraries generated by the combinatorial mutagenesis
of Tango-77 proteins. The most widely used techniques,
which are amenable to high through-put analysis, for
screening large gene libraries typically include cloning
the gene library into replicable expression vectors,
transforming appropriate cells with the resulting library
of vectors, and expressing the combinatorial genes under 
conditions in which detection of a desired activity 
facilitates isolation of the vector encoding the gene 
whose product was detected. Recursive ensemble 
mutagenesis (REM), a technique which enhances the
frequency of functional mutants in the libraries, can be
used in combination with the screening assays to identify
Tango-77 variants (Arkin and Yourvan (1992) Proc. Natl.
Acad. Sci. USA 89:7811-7815; Delgrave et al. (1993)
Protein Engineering 6(3):327-331).

An isolated Tango-77 protein, or a portion or
fragment thereof, can be used as an immunogen to generate 
antibodies that bind Tango-77 using standard techniques 
for polyclonal and monoclonal antibody preparation. The 
full-length Tango-77 protein can be used or, 
alternatively, the invention provides antigenic peptide 
fragments of Tango-77 for use as immunogens. The 
antigenic peptide of Tango-77 comprises at least 8 
(preferably 10, 15, 20, or 30) amino acid residues of the 
amino acid sequence shown in SEQ ID NO:2, SEQ ID NO:5,
SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11 or SEQ ID NO:13
and encompasses an epitope of Tango-77 such that an 
antibody raised against the peptide forms a specific 
immune complex with Tango-77. 
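
For illustration, candidate peptide immunogens of a given minimum
length can be enumerated as contiguous windows of a protein sequence.
The Python sketch below is a minimal example under stated assumptions:
the sequence shown is a hypothetical placeholder and is not any of the
Tango-77 sequences of the sequence listing, and the enumeration does
not predict which windows actually encompass an epitope.

```python
def antigenic_windows(protein, min_len=8):
    """Yield (start, peptide) for each contiguous window of min_len residues.

    start is the 1-based position of the first residue of the window.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    for i in range(len(protein) - min_len + 1):
        yield i + 1, protein[i:i + min_len]


# Hypothetical 16-residue placeholder sequence (not a Tango-77 sequence).
placeholder = "MKTAYIAKQRQISFVK"
peptides = list(antigenic_windows(placeholder, min_len=8))
# A 16-residue sequence contains 16 - 8 + 1 = 9 windows of 8 residues.
```

Longer candidate peptides (e.g., 10, 15, 20, or 30 residues) are
obtained by changing min_len; a sequence shorter than min_len yields no
windows.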

A Tango-77 immunogen typically is used to prepare 
antibodies by immunizing a suitable subject (e.g., 
rabbit, goat, mouse or other mammal) with the immunogen. 
An appropriate immunogenic preparation can contain, for 
example, recombinantly expressed Tango-77 protein or a
chemically synthesized Tango-77 polypeptide. The 
preparation can further include an adjuvant, such as 
Freund's complete or incomplete adjuvant, or similar 
immunostimulatory agent. Immunization of a suitable 
subject with an immunogenic Tango-77 preparation induces 
a polyclonal anti-Tango-77 antibody response.

Accordingly, another aspect of the invention 
pertains to anti-Tango-77 antibodies. The term 
"antibody" as used herein refers to immunoglobulin 
molecules and immunologically active portions of 
immunoglobulin molecules, i.e., molecules that contain an 
antigen binding site which specifically binds an antigen, 
such as Tango-77. A molecule which specifically binds to 
Tango-77 is a molecule which binds Tango-77, but does not 
substantially bind other molecules in a sample, e.g., a 
biological sample, which naturally contains Tango-77.
Examples of immunologically active portions of
immunoglobulin molecules include F(ab) and F(ab')2
fragments which can be generated by treating the antibody
with an enzyme such as pepsin. The invention provides
polyclonal and monoclonal antibodies that bind Tango-77.
The term "monoclonal antibody" or "monoclonal antibody
composition", as used herein, refers to a population of
antibody molecules that contain only one species of an
antigen binding site capable of immunoreacting with a
particular epitope of Tango-77. A monoclonal antibody
composition thus typically displays a single binding
affinity for a particular Tango-77 protein with which it
immunoreacts.

Polyclonal anti-Tango-77 antibodies can be 
prepared as described above by immunizing a suitable
subject with a Tango-77 immunogen. The anti-Tango-77
antibody titer in the immunized subject can be monitored
over time by standard techniques, such as with an enzyme
linked immunosorbent assay (ELISA) using immobilized
Tango-77. If desired, the antibody molecules directed
against Tango-77 can be isolated from the mammal (e.g.,
from the blood) and further purified by well-known
techniques, such as protein A chromatography to obtain
the IgG fraction. At an appropriate time after
immunization, e.g., when the anti-Tango-77 antibody
titers are highest, antibody-producing cells can be
obtained from the subject and used to prepare monoclonal
antibodies by standard techniques, such as the hybridoma
technique originally described by Kohler and Milstein
(1975) Nature 256:495-497, the human B cell hybridoma
technique (Kozbor et al. (1983) Immunol Today 4:72), the
EBV-hybridoma technique (Cole et al. (1985), Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp.
77-96) or trioma techniques. The technology for
producing hybridomas is well known (see generally Current
Protocols in Immunology (1994) Coligan et al. (eds.) John
Wiley & Sons, Inc., New York, NY). Briefly, an immortal
cell line (typically a myeloma) is fused to lymphocytes
(typically splenocytes) from a mammal immunized with a
Tango-77 immunogen as described above, and the culture
supernatants of the resulting hybridoma cells are
screened to identify a hybridoma producing a monoclonal
antibody that binds Tango-77.

Any of the many well known protocols used for 
fusing lymphocytes and immortalized cell lines can be 
applied for the purpose of generating an anti-Tango-77
monoclonal antibody (see, e.g., Current Protocols in
Immunology, supra; Galfre et al. (1977) Nature 266:550-52;
R.H. Kenneth, in Monoclonal Antibodies: A New Dimension
In Biological Analyses, Plenum Publishing Corp., New
York, New York (1980); and Lerner (1981) Yale J. Biol.
Med., 54:387-402). Moreover, the ordinarily skilled
worker will appreciate that there are many variations of 
such methods which also would be useful. Typically, the 
immortal cell line (e.g., a myeloma cell line) is derived 
from the same mammalian species as the lymphocytes. For 
example, murine hybridomas can be made by fusing 
lymphocytes from a mouse immunized with an immunogenic 
preparation of the present invention with an immortalized 
mouse cell line, e.g., a myeloma cell line that is 
sensitive to culture medium containing hypoxanthine, 
aminopterin and thymidine ("HAT medium"). Any of a
number of myeloma cell lines can be used as a fusion
partner according to standard techniques, e.g., the
P3-NS1/1-Ag4-1, P3-x63-Ag8.653 or Sp2/O-Ag14 myeloma lines.
These myeloma lines are available from ATCC. Typically, 
HAT-sensitive mouse myeloma cells are fused to mouse 
splenocytes using polyethylene glycol ("PEG"). Hybridoma 
cells resulting from the fusion are then selected using 
HAT medium, which kills unfused and unproductively fused
myeloma cells (unfused splenocytes die after several days
because they are not transformed). Hybridoma cells
producing a monoclonal antibody of the invention are
detected by screening the hybridoma culture supernatants
for antibodies that bind Tango-77, e.g., using a standard
ELISA assay.

Alternative to preparing monoclonal antibody- 
secreting hybridomas, a monoclonal anti-Tango-77 antibody 
can be identified and isolated by screening a recombinant 
combinatorial immunoglobulin library (e.g., an antibody 
phage display library) with Tango-77 to thereby isolate 
immunoglobulin library members that bind Tango-77. Kits 
for generating and screening phage display libraries are 
commercially available (e.g., the Pharmacia Recombinant 
Phage Antibody System, Catalog No. 27-9400-01; and the
Stratagene SurfZAP™ Phage Display Kit, Catalog No.
240612). Additionally, examples of methods and reagents
particularly amenable for use in generating and screening
an antibody display library can be found in, for example,
U.S. Patent No. 5,223,409; PCT Publication No. WO
92/18619; PCT Publication No. WO 91/17271; PCT
Publication No. WO 92/20791; PCT Publication No. WO
92/15679; PCT Publication No. WO 93/01288; PCT
Publication No. WO 92/01047; PCT Publication No. WO
92/09690; PCT Publication No. WO 90/02809; Fuchs et al.
(1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum.
Antibod. Hybridomas 3:81-85; Huse et al. (1989) Science
246:1275-1281; Griffiths et al. (1993) EMBO J. 12:725-734.

Additionally, recombinant anti-Tango-77 
antibodies, such as chimeric and humanized monoclonal
antibodies, comprising both human and non-human portions,
which can be made using standard recombinant DNA
techniques, are within the scope of the invention. Such
chimeric and humanized monoclonal antibodies can be
produced by recombinant DNA techniques known in the art,
for example using methods described in PCT Publication
No. WO 87/02671; European Patent Application 184,187;
European Patent Application 171,496; European Patent
Application 173,494; PCT Publication No. WO 86/01533;
U.S. Patent No. 4,816,567; European Patent Application
125,023; Better et al. (1988) Science 240:1041-1043; Liu
et al. (1987) Proc. Natl. Acad. Sci. USA 84:3439-3443;
Liu et al. (1987) J. Immunol. 139:3521-3526; Sun et al.
(1987) Proc. Natl. Acad. Sci. USA 84:214-218; Nishimura
et al. (1987) Canc. Res. 47:999-1005; Wood et al. (1985)
Nature 314:446-449; and Shaw et al. (1988) J. Natl.
Cancer Inst. 80:1553-1559; Morrison (1985) Science
229:1202-1207; Oi et al. (1986) Bio/Techniques 4:214;
U.S. Patent 5,225,539; Jones et al. (1986) Nature
321:552-525; Verhoeyan et al. (1988) Science 239:1534;
and Beidler et al. (1988) J. Immunol. 141:4053-4060.

Completely human antibodies are particularly 
desirable for therapeutic treatment of human patients.
Such antibodies can be produced using transgenic mice
which are incapable of expressing endogenous
immunoglobulin heavy and light chain genes, but which
can express human heavy and light chain genes. The
transgenic mice are immunized in the normal fashion with
a selected antigen, e.g., all or a portion of Tango-77.
Monoclonal antibodies directed against the antigen can be
obtained using conventional hybridoma technology. The
human immunoglobulin transgenes harbored by the
transgenic mice rearrange during B cell differentiation
and subsequently undergo class switching and somatic
mutation. Thus, using such a technique, it is possible
to produce therapeutically useful IgG, IgA and IgE
antibodies. For an overview of this technology for
producing human antibodies, see Lonberg and Huszar (1995,
Int. Rev. Immunol. 13:65-93). For a detailed discussion
of this technology for producing human antibodies and
human monoclonal antibodies and protocols for producing
such antibodies, see, e.g., U.S. Patent 5,625,126; U.S.
Patent 5,633,425; U.S. Patent 5,569,825; U.S. Patent
5,661,016; and U.S. Patent 5,545,806. In addition,
companies such as Abgenix, Inc. (Fremont, CA), can be
engaged to provide human antibodies directed against a
selected antigen using technology similar to that
described above.

Completely human antibodies which recognize a 
selected epitope can be generated using a technique
referred to as "guided selection." In this approach a
selected non-human monoclonal antibody, e.g., a murine
antibody, is used to guide the selection of a completely
human antibody recognizing the same epitope.

First, a non-human monoclonal antibody which binds
a selected antigen (epitope), e.g., an antibody which
inhibits Tango-77 activity, is identified. The heavy
chain and the light chain of the non-human antibody are
cloned and used to create phage display Fab fragments.
For example, the heavy chain gene can be cloned into a
plasmid vector so that the heavy chain can be secreted
from bacteria. The light chain gene can be cloned into a
phage coat protein gene so that the light chain can be
expressed on the surface of phage. A repertoire (random
collection) of human light chains fused to phage is used
to infect the bacteria which express the non-human heavy
chain. The resulting progeny phage display hybrid
antibodies (human light chain/non-human heavy chain).
The selected antigen is used in a panning screen to
select phage which bind the selected antigen. Several
rounds of selection may be required to identify such
phage. Next, human light chain genes are isolated from
the selected phage which bind the selected antigen.
These selected human light chain genes are then used to
guide the selection of human heavy chain genes as
follows. The selected human light chain genes are
inserted into vectors for expression by bacteria.
Bacteria expressing the selected human light chains are
infected with a repertoire of human heavy chains fused to
phage. The resulting progeny phage display human
antibodies (human light chain/human heavy chain).

Next, the selected antigen is used in a panning 
screen to select phage which bind the selected antigen.
The phage selected in this step display completely human
antibodies which recognize the same epitope recognized by
the original selected, non-human monoclonal antibody.
The genes encoding both the heavy and light chains are
readily isolated and can be further manipulated for
production of human antibody. This technology is
described by Jespers et al. (1994, Bio/technology 12:899-
903).

An anti-Tango-77 antibody (e.g., monoclonal 
antibody) can be used to isolate Tango-77 by standard
techniques, such as affinity chromatography or 
immunoprecipitation. An anti-Tango-77 antibody can 
facilitate the purification of natural Tango-77 from 
cells and of recombinantly produced Tango-77 expressed in
host cells. Moreover, an anti-Tango-77 antibody can be 
used to detect Tango-77 protein (e.g., in a cellular 
lysate or cell supernatant) in order to evaluate the 
abundance and pattern of expression of the Tango-77 
protein. Anti-Tango-77 antibodies can be used 
diagnostically to monitor protein levels in tissue as 
part of a clinical testing procedure, e.g., to, for
example, determine the efficacy of a given treatment
regimen. Detection can be facilitated by coupling the 
antibody to a detectable substance. Examples of 
detectable substances include various enzymes, prosthetic 
groups, fluorescent materials, luminescent materials,
bioluminescent materials, and radioactive materials.
Examples of suitable enzymes include horseradish
peroxidase, alkaline phosphatase, β-galactosidase, or
acetylcholinesterase; examples of suitable prosthetic
group complexes include streptavidin/biotin and
avidin/biotin; examples of suitable fluorescent materials
include umbelliferone, fluorescein, fluorescein
isothiocyanate, rhodamine, dichlorotriazinylamine
fluorescein, dansyl chloride or phycoerythrin; an example
of a luminescent material includes luminol; examples of
bioluminescent materials include luciferase, luciferin,
and aequorin, and examples of suitable radioactive
material include 125I, 35S or 3H.

III. Recombinant Expression Vectors and Host Cells
Another aspect of the invention pertains to 
vectors, preferably expression vectors, containing a
nucleic acid molecule encoding Tango-77 (or a portion
thereof). As used herein, the term "vector" refers to a
nucleic acid molecule capable of transporting another
nucleic acid to which it has been linked. One type of
vector is a "plasmid", which refers to a circular double
stranded DNA loop into which additional DNA segments can
be ligated. Another type of vector is a viral vector,
wherein additional DNA segments can be ligated into the
viral genome. Certain vectors are capable of autonomous
replication in a host cell into which they are introduced
(e.g., bacterial vectors having a bacterial origin of
replication and episomal mammalian vectors). Other
vectors (e.g., non-episomal mammalian vectors) are
integrated into the genome of a host cell upon
introduction into the host cell, and thereby are
replicated along with the host genome. Moreover, certain
vectors, expression vectors, are capable of directing the
expression of genes to which they are operably linked.
In general, expression vectors of utility in recombinant
DNA techniques are often in the form of plasmids
(vectors). However, the invention is intended to include
such other forms of expression vectors, such as viral
vectors (e.g., replication defective retroviruses,
adenoviruses and adeno-associated viruses), which serve
equivalent functions.

The recombinant expression vectors of the 
invention comprise a nucleic acid of the invention in a 
form suitable for expression of the nucleic acid in a 
host cell, which means that the recombinant expression 
vectors include one or more regulatory sequences, 
selected on the basis of the host cells to be used for 
expression, which is operably linked to the nucleic acid 
sequence to be expressed. Within a recombinant 
expression vector, "operably linked" is intended to mean 
that the nucleotide sequence of interest is linked to the 
regulatory sequence(s) in a manner which allows for
expression of the nucleotide sequence (e.g., in an in
vitro transcription/translation system or in a host cell
when the vector is introduced into the host cell). The
term "regulatory sequence" is intended to include
promoters, enhancers and other expression control
elements (e.g., polyadenylation signals). Such
regulatory sequences are described, for example, in
Goeddel; Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, CA (1990).
Regulatory sequences include those which direct
constitutive expression of a nucleotide sequence in many
types of host cell and those which direct expression of
the nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be
appreciated by those skilled in the art that the design
of the expression vector can depend on such factors as
the choice of the host cell to be transformed, the level
of expression of protein desired, etc. The expression
vectors of the invention can be introduced into host
cells to thereby produce proteins or peptides, including
fusion proteins or peptides, encoded by nucleic acids as
described herein (e.g., Tango-77 proteins, mutant forms
of Tango-77, fusion proteins, etc.).

The recombinant expression vectors of the 
invention can be designed for expression of Tango-77 in 
prokaryotic or eukaryotic cells, e.g., bacterial cells 
such as E. coli, insect cells (using baculovirus 
expression vectors), yeast cells or mammalian cells.
Suitable host cells are discussed further in Goeddel, 
Gene Expression Technology: Methods in Enzymology 185, 
Academic Press, San Diego, CA (1990). Alternatively, the 
recombinant expression vector can be transcribed and 
translated in vitro, for example using T7 promoter
regulatory sequences and T7 polymerase. 

Expression of proteins in prokaryotes is most 
often carried out in E. coli with vectors containing
constitutive or inducible promoters directing the
expression of either fusion or non-fusion proteins.
Fusion vectors add a number of amino acids to a protein
encoded therein, usually to the amino terminus of the
recombinant protein. Such fusion vectors typically serve
three purposes: 1) to increase expression of recombinant
protein; 2) to increase the solubility of the recombinant
protein; and 3) to aid in the purification of the
recombinant protein by acting as a ligand in affinity
purification. Often, in fusion expression vectors, a
proteolytic cleavage site is introduced at the junction
of the fusion moiety and the recombinant protein to
enable separation of the recombinant protein from the
fusion moiety subsequent to purification of the fusion
protein. Such enzymes, and their cognate recognition
sequences, include Factor Xa, thrombin and enterokinase.
Typical fusion expression vectors include pGEX (Pharmacia
Biotech Inc; Smith and Johnson (1988) Gene 67:31-40),
pMAL (New England Biolabs, Beverly, MA) and pRIT5
(Pharmacia, Piscataway, NJ) which fuse glutathione S-
transferase (GST), maltose E binding protein, or protein
A, respectively, to the target recombinant protein.

Examples of suitable inducible non-fusion E. coli
expression vectors include pTrc (Amann et al. (1988) Gene
69:301-315) and pET 11d (Studier et al., Gene Expression
Technology: Methods in Enzymology 185, Academic Press,
San Diego, California (1990) 60-89). Target gene
expression from the pTrc vector relies on host RNA
polymerase transcription from a hybrid trp-lac fusion
promoter. Target gene expression from the pET 11d vector
relies on transcription from a T7 gn10-lac fusion
promoter mediated by a coexpressed viral RNA polymerase
(T7 gn1). This viral polymerase is supplied by host
strains BL21(DE3) or HMS174(DE3) from a resident λ
prophage harboring a T7 gn1 gene under the
transcriptional control of the lacUV 5 promoter.

One strategy to maximize recombinant protein 
expression in E. coli is to express the protein in a host 
bacteria with an impaired capacity to proteolytically 
cleave the recombinant protein (Gottesman, Gene 
Expression Technology: Methods in Enzymology 185, 
Academic Press, San Diego, California (1990) 119-128).
Another strategy is to alter the nucleic acid sequence of 
the nucleic acid to be inserted into an expression vector 
so that the individual codons for each amino acid are 
those preferentially utilized in E. coli (Wada et al. 
(1992) Nucleic Acids Res. 20:2111-2118). Such alteration 
of nucleic acid sequences of the invention can be carried 
out by standard DNA synthesis techniques. 
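
The codon-replacement strategy described above can be sketched, for
illustration only, as a back-translation of an amino acid sequence
using one preferred codon per residue. In the Python sketch below, the
codon table is an assumption: it lists codons commonly reported as
frequently used in E. coli and should be checked against a current
codon-usage table before any practical use. The function name and the
example peptide are hypothetical.

```python
# Assumed table: one commonly cited high-frequency E. coli codon per amino acid.
# These choices are illustrative and are not taken from the source document.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein):
    """Return a DNA sequence encoding protein with the preferred codon per residue."""
    try:
        return "".join(PREFERRED_CODON[residue] for residue in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None


# Hypothetical five-residue peptide, used only to show the mapping.
dna = back_translate("MKTAY")  # "ATGAAAACCGCGTAT"
```

Using a single codon per residue is the simplest form of the approach;
the encoded amino acid sequence is unchanged because only synonymous
codons are substituted.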

In another embodiment, the Tango-77 expression 
vector is a yeast expression vector. Examples of vectors 
for expression in yeast S. cerevisiae include pYepSec1
(Baldari et al. (1987) EMBO J. 6:229-234), pMFa (Kurjan
and Herskowitz (1982) Cell 30:933-943), pJRY88 (Schultz
et al. (1987) Gene 54:113-123), pYES2 (Invitrogen
Corporation, San Diego, CA), and pPicZ (Invitrogen Corp,
San Diego, CA).

Alternatively, Tango-77 can be expressed in insect 
cells using baculovirus expression vectors. Baculovirus 
vectors available for expression of proteins in cultured 
insect cells (e.g., Sf 9 cells) include the pAc series
(Smith et al. (1983) Mol. Cell Biol. 3:2156-2165) and the
pVL series (Lucklow and Summers (1989) Virology 170:31-
39).

In yet another embodiment, a nucleic acid of the
invention is expressed in mammalian cells using a
mammalian expression vector. Examples of mammalian
expression vectors include pCDM8 (Seed (1987) Nature
329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6:187-
195). When used in mammalian cells, the expression
vector's control functions are often provided by viral
regulatory elements. For example, commonly used
promoters are derived from polyoma, Adenovirus 2,
cytomegalovirus and Simian Virus 40. For other suitable
expression systems for both prokaryotic and eukaryotic
cells see chapters 16 and 17 of Sambrook et al. (supra).

In another embodiment, the recombinant mammalian
expression vector is capable of directing expression of
the nucleic acid preferentially in a particular cell type
(e.g., tissue-specific regulatory elements are used to
express the nucleic acid). Tissue-specific regulatory
elements are known in the art. Non-limiting examples of
suitable tissue-specific promoters include the albumin
promoter (liver-specific; Pinkert et al. (1987) Genes
Dev. 1:268-277), lymphoid-specific promoters (Calame and
Eaton (1988) Adv. Immunol. 43:235-275), in particular
promoters of T cell receptors (Winoto and Baltimore
(1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et
al. (1983) Cell 33:729-740; Queen and Baltimore (1983)
Cell 33:741-748), neuron-specific promoters (e.g., the
neurofilament promoter; Byrne and Ruddle (1989) Proc.
Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific
promoters (Edlund et al. (1985) Science 230:912-916) and
mammary gland-specific promoters (e.g., milk whey
promoter; U.S. Patent No. 4,873,316 and European
Application Publication No. 264,166). Developmentally-
regulated promoters are also encompassed, for example the
murine hox promoters (Kessel and Gruss (1990) Science
249:374-379) and the α-fetoprotein promoter (Campes and
Tilghman (1989) Genes Dev. 3:537-546).

The invention further provides a recombinant 
expression vector comprising a DNA molecule of the 
invention cloned into the expression vector in an 
antisense orientation. That is, the DNA molecule is 
operably linked to a regulatory sequence in a manner 
which allows for expression (by transcription of the DNA 
molecule) of an RNA molecule which is antisense to 
Tango-77 mRNA. Regulatory sequences operably linked to a 
nucleic acid cloned in the antisense orientation can be 
chosen which direct the continuous expression of the 
antisense RNA molecule in a variety of cell types, for 
instance viral promoters and/or enhancers, or regulatory 
sequences can be chosen which direct constitutive, tissue 
specific or cell type specific expression of antisense 
RNA. The antisense expression vector can be in the form 
of a recombinant plasmid, phagemid or attenuated virus in 
which antisense nucleic acids are produced under the 
control of a high efficiency regulatory region, the 
activity of which can be determined by the cell type into 
which the vector is introduced. For a discussion of the 
regulation of gene expression using antisense genes see
Weintraub et al. (Reviews - Trends in Genetics, Vol. 1(1)
1986).
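
The relationship between a coding sequence and the RNA transcribed
from the same sequence cloned in the antisense orientation can be
illustrated with a short Python sketch. The six-nucleotide sequence
and the function names are hypothetical; the sketch shows only the
reverse-complement relationship and does not model promoter activity
or any other aspect of expression.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def sense_mrna(coding_strand):
    """Return the mRNA corresponding to a coding (sense) strand."""
    return coding_strand.upper().replace("T", "U")


def antisense_rna(coding_strand):
    """Return the RNA produced when the insert is cloned in antisense orientation."""
    return reverse_complement(coding_strand.upper()).replace("T", "U")


# Hypothetical six-nucleotide coding sequence (not a Tango-77 sequence).
coding = "ATGGCC"
mrna = sense_mrna(coding)         # "AUGGCC"
antisense = antisense_rna(coding)  # "GGCCAU", which base-pairs with the mRNA
```

Read in opposite directions, each base of the antisense RNA pairs with
the corresponding base of the mRNA (A with U, G with C), which is the
property on which an antisense expression vector relies.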

Another aspect of the invention pertains to host
cells into which a recombinant expression vector of the
invention has been introduced. The terms "host cell" and
"recombinant host cell" are used interchangeably herein.
It is understood that such terms refer not only to the 
particular subject cell but to the progeny or potential 
progeny of such a cell. Because certain modifications 
may occur in succeeding generations due to either 
mutation or environmental influences, such progeny may 
not, in fact, be identical to the parent cell, but are 
still included within the scope of the term as used 
herein. 

A host cell can be any prokaryotic or eukaryotic 
cell. For example, Tango-77 protein can be expressed in
bacterial cells such as E. coli, insect cells, yeast or
mammalian cells (such as Chinese hamster ovary cells
(CHO) or COS cells). Other suitable host cells are known
to those skilled in the art.

Vector DNA can be introduced into prokaryotic or 
eukaryotic cells via conventional transformation or 
transfection techniques. As used herein, the terms 
"transformation" and "transfection" are intended to refer
to a variety of art-recognized techniques for introducing
foreign nucleic acid (e.g., DNA) into a host cell,
including calcium phosphate or calcium chloride co-
precipitation, DEAE-dextran-mediated transfection,
lipofection, or electroporation. Suitable methods for
transforming or transfecting host cells can be found in
Sambrook, et al. (supra), and other laboratory manuals.

For stable transfection of mammalian cells, it is 
known that, depending upon the expression vector and 
transfection technique used, only a small fraction of 
cells may integrate the foreign DNA into their genome.
In order to identify and select these integrants, a gene
that encodes a selectable marker (e.g., for resistance to 
antibiotics) is generally introduced into the host cells 
along with the gene of interest. Preferred selectable 
markers include those which confer resistance to drugs, 
such as G418, hygromycin and methotrexate. Nucleic acid 
encoding a selectable marker can be introduced into a 
host cell on the same vector as that encoding Tango-77 or 
can be introduced on a separate vector. Cells stably 
transfected with the introduced nucleic acid can be 
identified by drug selection (e.g., cells that have 
incorporated the selectable marker gene will survive, 
while the other cells die).

A host cell of the invention, such as a 
prokaryotic or eukaryotic host cell in culture, can be 
used to produce (i.e., express) Tango-77 protein. 
Accordingly, the invention further provides methods for 
producing Tango-77 protein using the host cells of the 
invention. In one embodiment, the method comprises
culturing the host cell of invention (into which a 
recombinant expression vector encoding Tango-77 has been 
introduced) in a suitable medium such that Tango-77 
protein is produced. In another embodiment, the method
further comprises isolating Tango-77 from the medium or 
the host cell. 

The host cells of the invention can also be used 
to produce nonhuman transgenic animals. For example, in 
one embodiment, a host cell of the invention is a 
fertilized oocyte or an embryonic stem cell into which 
Tango-77-coding sequences have been introduced. Such 
host cells can then be used to create non-human 
transgenic animals in which exogenous Tango-77 sequences 
have been introduced into their genome or homologous 
recombinant animals in which endogenous Tango-77 
sequences have been altered. Such animals are useful for 
studying the function and/or activity of Tango-77 and for
identifying and/or evaluating modulators of Tango-77 
activity. As used herein, a "transgenic animal" is a 
non-human animal, preferably a mammal, more preferably a 
rodent such as a rat or mouse, in which one or more of 
the cells of the animal includes a transgene. Other 
examples of transgenic animals include non-human 
primates, sheep, dogs, cows, goats, chickens, amphibians, 
etc. A transgene is exogenous DNA which is integrated
into the genome of a cell from which a transgenic animal
develops and which remains in the genome of the mature 
animal, thereby directing the expression of an encoded 
gene product in one or more cell types or tissues of the 
transgenic animal. As used herein, a "homologous
recombinant animal" is a non-human animal, preferably a
mammal, more preferably a mouse, in which an endogenous
Tango-77 gene has been altered by homologous
recombination between the endogenous gene and an
exogenous DNA molecule introduced into a cell of the
animal, e.g., an embryonic cell of the animal, prior to
development of the animal. 

A transgenic animal of the invention can be 
created by introducing Tango-77-encoding nucleic acid
into the male pronuclei of a fertilized oocyte, e.g., by
microinjection or retroviral infection, and allowing the
oocyte to develop in a pseudopregnant female foster
animal. The Tango-77 cDNA sequence (e.g., that of SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, or the cDNA
of ATCC 98807) can be introduced as a transgene into the
genome of a non-human animal. Alternatively, a non-human
homologue of the human Tango-77 gene, such as a mouse
Tango-77 gene, can be isolated based on hybridization to
the human Tango-77 cDNA and used as a transgene.
Intronic sequences and polyadenylation signals can also
be included in the transgene to increase the efficiency
of expression of the transgene. A tissue-specific 
regulatory sequence(s) can be operably linked to the
Tango-77 transgene to direct expression of Tango-77 
protein to particular cells. Methods for generating
transgenic animals via embryo manipulation and
microinjection, particularly animals such as mice, have
become conventional in the art and are described, for
example, in U.S. Patent Nos. 4,736,866 and 4,870,009,
U.S. Patent No. 4,873,191 and in Hogan, Manipulating the
Mouse Embryo (Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y., 1986). Similar methods are used for
production of other transgenic animals. A transgenic 
founder animal can be identified based upon the presence 
of the Tango-77 transgene in its genome and/or expression 
of Tango-77 mRNA in tissues or cells of the animals. A
transgenic founder animal can then be used to breed
additional animals carrying the transgene. Moreover,
transgenic animals carrying a transgene encoding Tango-77 
can further be bred to other transgenic animals carrying 
other transgenes. 

To create a homologous recombinant animal, a
vector is prepared which contains at least a portion of a
Tango-77 gene (e.g., a human or a non-human homolog of
the Tango-77 gene, e.g., a murine Tango-77 gene) into
which a deletion, addition or substitution has been
introduced to thereby alter, e.g., functionally disrupt,
the Tango-77 gene. In a preferred embodiment, the vector
is designed such that, upon homologous recombination, the
endogenous Tango-77 gene is functionally disrupted (i.e.,
no longer encodes a functional protein; also referred to
as a "knock out" vector). Alternatively, the vector
can be designed such that, upon homologous recombination,
the endogenous Tango-77 gene is mutated or otherwise
altered but still encodes functional protein (e.g., the
upstream regulatory region can be altered to thereby
alter the expression of the endogenous Tango-77 protein).
In the homologous recombination vector, the altered
portion of the Tango-77 gene is flanked at its 5' and 3' 
ends by additional nucleic acid of the Tango-77 gene to 
allow for homologous recombination to occur between the 
exogenous Tango-77 gene carried by the vector and an 
endogenous Tango-77 gene in an embryonic stem cell. The 
additional flanking Tango-77 nucleic acid is of 
sufficient length for successful homologous recombination 
with the endogenous gene. Typically, several kilobases 
of flanking DNA (both at the 5' and 3' ends) are included 
in the vector (see, e.g., Thomas and Capecchi (1987) Cell 
51:503 for a description of homologous recombination 
vectors). The vector is introduced into an embryonic
stem cell line (e.g., by electroporation) and cells in 
which the introduced Tango-77 gene has homologously 
recombined with the endogenous Tango-77 gene are selected 
(see, e.g., Li et al. (1992) Cell 69:915). The selected 
cells are then injected into a blastocyst of an animal 
(e.g., a mouse) to form aggregation chimeras (see, e.g., 
Bradley in Teratocarcinomas and Embryonic Stem Cells: A 
Practical Approach, Robertson, ed. (IRL, Oxford, 1987) 
pp. 113-152). A chimeric embryo can then be implanted 
into a suitable pseudopregnant female foster animal and 
the embryo brought to term. Progeny harboring the 
homologously recombined DNA in their germ cells can be 
used to breed animals in which all cells of the animal 
contain the homologously recombined DNA by germline 
transmission of the transgene. Methods for constructing
homologous recombination vectors and homologous
recombinant animals are described further in Bradley
(1991) Current Opinion in Bio/Technology 2:823-829 and in
PCT Publication Nos. WO 90/11354, WO 91/01140, WO
92/0968, and WO 93/04169.

In another embodiment, transgenic non-human
animals can be produced which contain selected systems 
which allow for regulated expression of the transgene. 
One example of such a system is the cre/loxP recombinase 
system of bacteriophage P1. For a description of the
cre/loxP recombinase system, see, e.g., Lakso et al.
(1992) Proc. Natl. Acad. Sci. USA 89:6232-6236. Another
example of a recombinase system is the FLP recombinase
system of Saccharomyces cerevisiae (O'Gorman et al.
(1991) Science 251:1351-1355). If a cre/loxP recombinase
system is used to regulate expression of the transgene, 
animals containing transgenes encoding both the Cre 
recombinase and a selected protein are required. Such 
animals can be provided through the construction of
"double" transgenic animals, e.g., by mating two
transgenic animals, one containing a transgene encoding a
selected protein and the other containing a transgene 
encoding a recombinase. 

Clones of the non-human transgenic animals 
described herein can also be produced according to the
methods described in Wilmut et al. (1997) Nature 385:810- 
813 and PCT Publication Nos. WO 97/07668 and WO 97/07669. 
In brief, a cell, e.g., a somatic cell, from the 
transgenic animal can be isolated and induced to exit the 
growth cycle and enter G0 phase. The quiescent cell can
then be fused, e.g., through the use of electrical 
pulses, to an enucleated oocyte from an animal of the 
same species from which the quiescent cell is isolated. 
The reconstructed oocyte is then cultured such that it 
develops to a morula or blastocyst and is then transferred to a
pseudopregnant female foster animal. The offspring borne
of this female foster animal will be a clone of the 
animal from which the cell, e.g., the somatic cell, is 
isolated. 

IV. Pharmaceutical Compositions

The Tango-77 nucleic acid molecules, Tango-77
proteins, and anti-Tango-77 antibodies (also referred to 
herein as "active compounds") of the invention can be 
incorporated into pharmaceutical compositions suitable 
for administration. Such compositions typically comprise 
the nucleic acid molecule, protein, or antibody and a 
pharmaceutically acceptable carrier. As used herein, the
language "pharmaceutically acceptable carrier" is
intended to include any and all solvents, dispersion 
media, coatings, antibacterial and antifungal agents, 
isotonic and absorption delaying agents, and the like, 
compatible with pharmaceutical administration. The use 
of such media and agents for pharmaceutically active 
substances is well known in the art. Except insofar as 
any conventional media or agent is incompatible with the 
active compound, use thereof in the compositions is 
contemplated. Supplementary active compounds can also be 
incorporated into the compositions. 

A pharmaceutical composition of the invention is 
formulated to be compatible with its intended route of 
administration. Examples of routes of administration 
include parenteral (e.g., intravenous, intradermal,
subcutaneous), oral (e.g., inhalation), transdermal
(topical), transmucosal, and rectal administration. 
Solutions or suspensions used for parenteral, 
intradermal, or subcutaneous application can include the 
following components: a sterile diluent such as water for 
injection, saline solution, fixed oils, polyethylene 
glycols, glycerine, propylene glycol or other synthetic 
solvents; antibacterial agents such as benzyl alcohol or 
methyl parabens; antioxidants such as ascorbic acid or 
sodium bisulfite; chelating agents such as 
ethylenediaminetetraacetic acid; buffers such as 
acetates, citrates or phosphates; and agents for the
adjustment of tonicity such as sodium chloride or
dextrose. pH can be adjusted with acids or bases, such
as hydrochloric acid or sodium hydroxide. The parenteral 
preparation can be enclosed in ampoules, disposable 
syringes or multiple dose vials made of glass or plastic. 

Pharmaceutical compositions suitable for 
injectable use include sterile aqueous solutions (where 
water soluble) or dispersions and sterile powders for the 
extemporaneous preparation of sterile injectable 
solutions or dispersions. For intravenous 
administration, suitable carriers include physiological 
saline, bacteriostatic water, Cremophor EL™ (BASF; 
Parsippany, NJ) or phosphate buffered saline (PBS). In
all cases, the composition must be sterile and should be 
fluid to the extent that easy syringability exists. It 
must be stable under the conditions of manufacture and 
storage and must be preserved against the contaminating 
action of microorganisms such as bacteria and fungi. The 
carrier can be a solvent or dispersion medium containing, 
for example, water, ethanol, polyol (for example, 
glycerol, propylene glycol, and liquid polyethylene
glycol, and the like), and suitable mixtures thereof. 
The proper fluidity can be maintained, for example, by 
the use of a coating such as lecithin, by the maintenance 
of the required particle size in the case of dispersion 
and by the use of surfactants. Prevention of the action 
of microorganisms can be achieved by various 
antibacterial and antifungal agents, for example, 
parabens, chlorobutanol, phenol, ascorbic acid,
thimerosal, and the like. In many cases, it will be 
preferable to include isotonic agents, for example, 
sugars, polyalcohols such as mannitol and sorbitol, or sodium
chloride in the composition. Prolonged absorption of the
injectable compositions can be brought about by including
in the composition an agent which delays absorption, for
example, aluminum monostearate and gelatin. 

Sterile injectable solutions can be prepared by 
incorporating the active compound (e.g., a Tango-77 
protein or anti-Tango-77 antibody) in the required amount 
in an appropriate solvent with one or a combination of 
ingredients enumerated above, as required, followed by 
filtered sterilization. Generally, dispersions are 
prepared by incorporating the active compound into a 
sterile vehicle which contains a basic dispersion medium 
and the required other ingredients from those enumerated 
above. In the case of sterile powders for the 
preparation of sterile injectable solutions, the 
preferred methods of preparation are vacuum drying and 
freeze-drying which yields a powder of the active 
ingredient plus any additional desired ingredient from a 
previously sterile-filtered solution thereof. 

Oral compositions generally include an inert 
diluent or an edible carrier. They can be enclosed in 
gelatin capsules or compressed into tablets. For the 
purpose of oral therapeutic administration, the active 
compound can be incorporated with excipients and used in 
the form of tablets, troches, or capsules. Oral 
compositions can also be prepared using a fluid carrier 
for use as a mouthwash, wherein the compound in the fluid 
carrier is applied orally and swished and expectorated or 
swallowed. Pharmaceutically compatible binding agents
and/or adjuvant materials can be included as part of the 
composition. The tablets, pills, capsules, troches and 
the like can contain any of the following ingredients, or 
compounds of a similar nature: a binder such as 
microcrystalline cellulose, gum tragacanth or gelatin; an 
excipient such as starch or lactose; a disintegrating
agent such as alginic acid, Primogel, or corn starch; a 
lubricant such as magnesium stearate or Sterotes; a
glidant such as colloidal silicon dioxide; a sweetening 
agent such as sucrose or saccharin; or a flavoring agent 
such as peppermint, methyl salicylate, or orange 
flavoring. 

For administration by inhalation, the compounds 
are delivered in the form of an aerosol spray from a 
pressurized container or dispenser which contains a 
suitable propellant, e.g., a gas such as carbon dioxide, 
or a nebulizer. 

Systemic administration can also be by 
transmucosal or transdermal means. For transmucosal or 
transdermal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. 
Such penetrants are generally known in the art, and 
include, for example, for transmucosal administration, 
detergents, bile salts, and fusidic acid derivatives. 
Transmucosal administration can be accomplished through 
the use of nasal sprays or suppositories. For 
transdermal administration, the active compounds are 
formulated into ointments, salves, gels, or creams as 
generally known in the art. 

The compounds can also be prepared in the form of 
suppositories (e.g., with conventional suppository bases 
such as cocoa butter and other glycerides) or retention 
enemas for rectal delivery. 

In one embodiment, the active compounds are
prepared with carriers that will protect the compound 
against rapid elimination from the body, such as a 
controlled release formulation, including implants and 
microencapsulated delivery systems. Biodegradable, 
biocompatible polymers can be used, such as ethylene 
vinyl acetate, polyanhydrides, polyglycolic acid,
collagen, polyorthoesters, and polylactic acid. Methods 
for preparation of such formulations will be apparent to 
those skilled in the art. The materials can also be 
obtained commercially from Alza Corporation and Nova
Pharmaceuticals, Inc. Liposomal suspensions (including 
liposomes targeted to infected cells with monoclonal 
antibodies to viral antigens) can also be used as 
pharmaceutically acceptable carriers. These can be
prepared according to methods known to those skilled in
the art, for example, as described in U.S. Patent No.
4,522,811.

It is especially advantageous to formulate oral or 
parenteral compositions in dosage unit form for ease of
administration and uniformity of dosage. Dosage unit
form as used herein refers to physically discrete units
suited as unitary dosages for the subject to be treated;
each unit containing a predetermined quantity of active
compound calculated to produce the desired therapeutic
effect in association with the required pharmaceutical
carrier. The specifications for the dosage unit forms of
the invention are dictated by and directly dependent on
the unique characteristics of the active compound and the
particular therapeutic effect to be achieved, and the
limitations inherent in the art of compounding such an 
active compound for the treatment of individuals. 

The nucleic acid molecules of the invention can be 
inserted into vectors and used as gene therapy vectors. 
Gene therapy vectors can be delivered to a subject by,
for example, intravenous injection, local administration
(U.S. Patent 5,328,470) or by stereotactic injection
(see, e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA
91:3054-3057). The pharmaceutical preparation of the
gene therapy vector can include the gene therapy vector
in an acceptable diluent, or can comprise a slow release
matrix in which the gene delivery vehicle is imbedded.
Alternatively, where the complete gene delivery vector
can be produced intact from recombinant cells, e.g.,
retroviral vectors, the pharmaceutical preparation can
include one or more cells which produce the gene delivery
system.

The pharmaceutical compositions can be included in 
a container, pack, or dispenser together with 
instructions for administration.

V. Uses and Methods of the Invention

The nucleic acid molecules, proteins, protein 
homologues, and antibodies described herein can be used 
in one or more of the following methods: a) screening
assays; b) detection assays (e.g., chromosomal mapping,
tissue typing, forensic biology); c) predictive medicine
(e.g., diagnostic assays, prognostic assays, monitoring
clinical trials, and pharmacogenomics); and d) methods of
treatment (e.g., therapeutic and prophylactic). A
Tango-77 protein interacts with other cellular proteins
and can thus be used for regulation of inflammation. The
polypeptides of the invention can be used in assays to
determine biological activity. For example, they could
be used in a panel of proteins for high-throughput
screening.

The isolated nucleic acid molecules of the 
invention can be used to express Tango-77 protein (e.g., 
via a recombinant expression vector in a host cell in 
gene therapy applications), to detect Tango-77 mRNA
(e.g., in a biological sample) or a genetic lesion in a
Tango-77 gene, and to modulate Tango-77 activity. In
addition, the Tango-77 proteins can be used to screen
drugs or compounds which modulate the Tango-77 activity
or expression as well as to treat disorders characterized
by insufficient or excessive production of Tango-77
protein or production of Tango-77 protein forms which
have decreased or aberrant activity compared to Tango-77
wild type protein. In addition, the anti-Tango-77
antibodies of the invention can be used to detect and
isolate Tango-77 proteins and modulate Tango-77 activity.

This invention further pertains to novel agents 
identified by the above-described screening assays and
uses thereof for treatments as described herein.

A. Screening Assays 

The invention provides a method (also referred to 
herein as a "screening assay") for identifying 
modulators, i.e., candidate or test compounds or agents 
(e.g., peptides, peptidomimetics, small molecules or
other drugs) which bind to Tango-77 proteins or have a 
stimulatory or inhibitory effect on, for example, 
Tango-77 expression or Tango-77 activity. 

Examples of methods for the synthesis of molecular 
libraries can be found in the art, for example in:
DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909;
Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422;
Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et
al. (1993) Science 261:1303; Carell et al. (1994) Angew.
Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew.
Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J.
Med. Chem. 37:1233.

Libraries of compounds may be presented in 
solution (e.g., Houghten (1992) Bio/Techniques 13:412-421),
or on beads (Lam (1991) Nature 354:82-84), chips
(Fodor (1993) Nature 364:555-556), bacteria (U.S. Patent
No. 5,223,409), spores (U.S. Patent Nos. 5,571,698; 5,403,484;
and 5,223,409), plasmids (Cull et al. (1992) Proc. Natl.
Acad. Sci. USA 89:1865-1869) or phage (Scott and Smith
(1990) Science 249:386-390; Devlin (1990) Science
249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci.
USA 87:6378-6382; and Felici (1991) J. Mol. Biol.
222:301-310).

In another embodiment, an assay is used to
determine the ability of the test compound to modulate
the activity of Tango-77 or a biologically active portion
thereof, for example, by determining the ability of the
Tango-77 protein to bind to or interact with a Tango-77
target molecule. As used herein, a "target molecule" is
a molecule with which a Tango-77 protein binds or
interacts in nature, for example, a molecule on the
surface of a cell. A Tango-77 target molecule can be a
non-Tango-77 molecule or a Tango-77 protein or
polypeptide of the present invention. In one embodiment,
a Tango-77 target molecule is a component of a signal
transduction pathway; for example, Tango-77 may bind to an
IL-1 receptor or another receptor, thereby blocking the
receptor and inhibiting future signal transduction.
Determining the ability of the Tango-77 protein to bind
to or interact with a Tango-77 target molecule can be
accomplished by one of the methods described above. In a
preferred embodiment, determining the ability of the
Tango-77 protein to bind to or interact with a Tango-77
target molecule can be accomplished by determining the
activity of the target molecule. For example, the
activity of the target molecule can be determined by
detecting induction of a cellular second messenger of the
target (e.g., intracellular Ca2+, diacylglycerol, IP3,
etc.), detecting catalytic/enzymatic activity of the
target on an appropriate substrate, detecting the
induction of a reporter gene (e.g., a Tango-77-responsive
regulatory element operably linked to a nucleic acid
encoding a detectable marker, e.g., luciferase), or
detecting a cellular response, for example, inflammation.

In yet another embodiment, an assay of the present 
invention is a cell-free assay comprising contacting a 
Tango-77 protein or biologically active portion thereof 
with a test compound and determining the ability of the
test compound to bind to the Tango-77 protein or
biologically active portion thereof. Binding of the test
compound to the Tango-77 protein can be determined either
directly or indirectly as described above. In a
preferred embodiment, the assay includes contacting the
Tango-77 protein or biologically active portion thereof
with a known compound which binds Tango-77 to form an
assay mixture, contacting the assay mixture with a test
compound, and determining the ability of the test
compound to interact with a Tango-77 protein, wherein
determining the ability of the test compound to interact
with a Tango-77 protein comprises determining the ability
of the test compound to preferentially bind to Tango-77
or biologically active portion thereof as compared to the
known compound.

In another embodiment, an assay is a cell-free 
assay comprising contacting Tango-77 protein or 
biologically active portion thereof with a test compound 
and determining the ability of the test compound to 
modulate (e.g., stimulate or inhibit) the activity of the
Tango-77 protein or biologically active portion thereof.
Determining the ability of the test compound to modulate
the activity of Tango-77 can be accomplished, for
example, by determining the ability of the Tango-77
protein to bind to a Tango-77 target molecule by one of
the methods described above for determining direct
binding. In an alternative embodiment, determining the
ability of the test compound to modulate the activity of
Tango-77 can be accomplished by determining the ability
of the Tango-77 protein to further modulate a Tango-77
target molecule. For example, the catalytic/enzymatic 
activity of the target molecule on an appropriate 
substrate can be determined as previously described. 

In yet another embodiment, the cell-free assay
comprises contacting the Tango-77 protein or biologically
active portion thereof with a known compound which binds
Tango-77 to form an assay mixture, contacting the assay
mixture with a test compound, and determining the ability
of the test compound to interact with a Tango-77 protein,
wherein determining the ability of the test compound to
interact with a Tango-77 protein comprises determining
the ability of the Tango-77 protein to preferentially
bind to or modulate the activity of a Tango-77 target
molecule.

It is possible that membrane-bound forms of Tango-77
exist. The cell-free assays of the present invention
are amenable to use of both forms of Tango-77. In the
case of cell-free assays comprising a membrane-bound form
of Tango-77, it may be desirable to utilize a
solubilizing agent such that the membrane-bound form of
Tango-77 is maintained in solution. Examples of such
solubilizing agents include non-ionic detergents such as
n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside,
octanoyl-N-methylglucamide, decanoyl-N-methylglucamide,
Triton® X-100, Triton® X-114, Thesit®,
Isotridecypoly(ethylene glycol ether)n,
3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate
(CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane
sulfonate (CHAPSO), or
N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

In more than one embodiment of the above assay 
methods of the present invention, it may be desirable to 
immobilize either Tango-77 or its target molecule to 
facilitate separation of complexed from uncomplexed forms 
of one or both of the proteins, as well as to accommodate
automation of the assay. Binding of a test compound to
Tango-77, or interaction of Tango-77 with a target
molecule in the presence and absence of a candidate
compound, can be accomplished in any vessel suitable for
containing the reactants. Examples of such vessels
include microtitre plates, test tubes, and microcentrifuge
tubes. In one embodiment, a fusion protein
can be provided which adds a domain that allows one or 
both of the proteins to be bound to a matrix. For 
example, glutathione-S-transferase/Tango-77 fusion
proteins or glutathione-S-transferase/target fusion
proteins can be adsorbed onto glutathione sepharose beads
(Sigma Chemical Co.; St. Louis, MO) or glutathione
derivatized microtitre plates, which are then combined
with the test compound or the test compound and either
the non-adsorbed target protein or Tango-77 protein, and
the mixture incubated under conditions conducive to
complex formation (e.g., at physiological conditions for
salt and pH). Following incubation, the beads or
microtitre plate wells are washed to remove any unbound 
components and complex formation is measured either 
directly or indirectly, for example, as described above. 
Alternatively, the complexes can be dissociated from the 
matrix, and the level of Tango-77 binding or activity 
determined using standard techniques. 

Other techniques for immobilizing proteins on 
matrices can also be used in the screening assays of the 
invention. For example, either Tango-77 or its target
molecule can be immobilized utilizing conjugation of
biotin and streptavidin. Biotinylated Tango-77 or target
molecules can be prepared from biotin-NHS (N-hydroxy-succinimide)
using techniques well known in the art
(e.g., biotinylation kit, Pierce Chemicals; Rockford,
IL), and immobilized in the wells of streptavidin-coated
96 well plates (Pierce Chemical). Alternatively,
antibodies reactive with Tango-77 or target molecules but 
which do not interfere with binding of the Tango-77 
protein to its target molecule can be derivatized to the 
wells of the plate, and unbound target or Tango-77 
trapped in the wells by antibody conjugation. Methods
for detecting such complexes, in addition to those
described above for the GST-immobilized complexes,
include immunodetection of complexes using antibodies
reactive with the Tango-77 or target molecule, as well as
enzyme-linked assays which rely on detecting an enzymatic
activity associated with the Tango-77 or target molecule.

In another embodiment, modulators of Tango-77 
expression are identified in a method in which a cell is 
contacted with a candidate compound and the expression of 
Tango-77 mRNA or protein in the cell is determined. The 
level of expression of Tango-77 mRNA or protein in the 
presence of the candidate compound is compared to the 
level of expression of Tango-77 mRNA or protein in the 
absence of the candidate compound. The candidate 
compound can then be identified as a modulator of 
Tango-77 expression based on this comparison. For 
example, when expression of Tango-77 mRNA or protein is 
greater (statistically significantly greater) in the 
presence of the candidate compound than in its absence, 
the candidate compound is identified as a stimulator of 
Tango-77 mRNA or protein expression. Alternatively, when 
expression of Tango-77 mRNA or protein is less 
(statistically significantly less) in the presence of the 
candidate compound than in its absence, the candidate 
compound is identified as an inhibitor of Tango-77 mRNA 
or protein expression. The level of Tango-77 mRNA or 
protein expression in the cells can be determined by 
methods described herein for detecting Tango-77 mRNA or 
protein. 
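
The comparison described above can be sketched in code. The following is a minimal illustration only, not a method set out in this specification: it assumes replicate expression measurements (arbitrary units) are available with and without the candidate compound, and it uses an exact two-sided permutation test on the difference in means as one possible way to decide whether a difference is "statistically significantly" greater or less. The function names, the 0.05 threshold, and the sample values are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_rest = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    pooled_sum = sum(pooled)
    extreme = 0
    total = 0
    # Enumerate every way of relabelling the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (pooled_sum - group_sum) / n_rest)
        total += 1
        # Small tolerance guards against floating-point rounding on ties.
        if diff >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_candidate(treated, control, alpha=0.05):
    """Label a compound by its effect on measured Tango-77 expression."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"


# Hypothetical replicate measurements (arbitrary expression units).
with_compound = [2.1, 2.4, 2.2, 2.3]
without_compound = [1.0, 1.1, 0.9, 1.2]
print(classify_candidate(with_compound, without_compound))
```

With four replicates per group the smallest attainable two-sided p-value is 2/70, so fewer replicates could never reach a 0.05 threshold under this particular test; other significance tests could equally be substituted.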

In yet another aspect of the invention, the 
Tango-77 proteins can be used as "bait proteins" in a 
two-hybrid assay or three-hybrid assay (see, e.g., U.S.
Patent No. 5,283,317; Zervos et al. (1993) Cell 72:223-232;
Madura et al. (1993) J. Biol. Chem. 268:12046-12054;
Bartel et al. (1993) Bio/Techniques 14:920-924; Iwabuchi
et al. (1993) Oncogene 8:1693-1696; and PCT Publication
No. WO 94/10300), to identify other proteins which bind
to or interact with Tango-77 ("Tango-77-binding proteins"
or "Tango-77-bp") and modulate Tango-77 activity. Such
Tango-77-binding proteins are also likely to be involved
in the propagation of signals by the Tango-77 proteins 
as, for example, upstream or downstream elements of the 
Tango-77 pathway. 

The two-hybrid system is based on the modular 
nature of most transcription factors, which consist of 
separable DNA-binding and activation domains. Briefly, 
the assay utilizes two different DNA constructs. In one 
construct, the gene that codes for Tango-77 is fused to a 
gene encoding the DNA binding domain of a known 
transcription factor (e.g., GAL-4). In the other
construct, a DNA sequence, from a library of DNA 
sequences, that encodes an unidentified protein ("prey" 
or "sample") is fused to a gene that codes for the 
activation domain of the known transcription factor. If 
the "bait" and the "prey" proteins are able to interact, 
in vivo, forming a Tango-77-dependent complex, the
DNA-binding and activation domains of the transcription
factor are brought into close proximity. This proximity 
allows transcription of a reporter gene (e.g., LacZ) 
which is operably linked to a transcriptional regulatory 
site responsive to the transcription factor. Expression 
of the reporter gene can be detected and cell colonies 
containing the functional transcription factor can be 
isolated and used to obtain the cloned gene which encodes 
the protein which interacts with Tango-77.

This invention further pertains to novel agents 
identified by the above-described screening assays and 
uses thereof for treatments as described herein.

B. Detection Assays

Portions or fragments of the cDNA sequence 
identified herein (and the corresponding complete gene 
sequences) can be used in numerous ways as polynucleotide 
reagents. For example, the sequence can be used to: (i) 
map the respective gene on a chromosome and, thus, locate 
gene regions associated with genetic disease; (ii) 
identify an individual from a minute biological sample 
(tissue typing); and (iii) aid in forensic identification
of a biological sample. These applications are described 
in the subsections below. 



1. Chromosome Mapping

Once the sequence (or a portion of the sequence)
of a gene has been isolated, this sequence can be used to
map the location of the gene on a chromosome.
Accordingly, Tango-77 nucleic acid molecules described
herein, or fragments thereof, can be used to map the
location of the Tango-77 gene(s) on a chromosome. The
mapping of the Tango-77 sequences to chromosomes is an
important first step in correlating these sequences with
genes associated with disease.

Briefly, a Tango-77 gene can be mapped to
chromosomes by preparing PCR primers (preferably 15-25 bp
in length) from the Tango-77 sequences. Computer
analysis of Tango-77 sequences can be used to rapidly
select primers that do not span more than one exon in the
genomic DNA, since primers that span an exon boundary
would complicate the amplification process.
These primers can then be used for PCR screening of
somatic cell hybrids containing individual human
chromosomes. Only those hybrids containing the human
gene corresponding to the Tango-77 sequences will yield
an amplified fragment.
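
The computer-assisted primer selection mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the exon lengths passed in are hypothetical (the text does not give the genomic structure of the Tango-77 gene), and the function name and the default primer length of 20 bases are chosen for the example only.

```python
def single_exon_primer_starts(exon_lengths, primer_len=20):
    """Return (start, exon_index) for every primer-sized window of a cDNA
    that lies entirely within one exon.

    exon_lengths: hypothetical lengths of consecutive exons, in cDNA order.
    start: 0-based offset of the window in the spliced cDNA sequence.
    """
    windows = []
    exon_start = 0
    for index, length in enumerate(exon_lengths):
        exon_end = exon_start + length  # exclusive end of this exon
        # Windows must end at or before the exon boundary.
        for start in range(exon_start, exon_end - primer_len + 1):
            windows.append((start, index))
        exon_start = exon_end
    return windows
```

Exons shorter than the primer contribute no candidate windows, which is the intended behavior: any primer drawn from such a region would necessarily cross an exon boundary.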

Somatic cell hybrids are prepared by fusing 
somatic cells from different mammals (e.g., human and
mouse cells). As hybrids of human and mouse cells grow
and divide, they gradually lose human chromosomes in
random order, but retain the mouse chromosomes. By using
media in which mouse cells cannot grow (because they lack
a particular enzyme) but in which human cells can, the
one human chromosome that contains the gene encoding the
needed enzyme will be retained. By using various media,
panels of hybrid cell lines can be established. Each
cell line in a panel contains either a single human
chromosome or a small number of human chromosomes, and a
full set of mouse chromosomes, allowing easy mapping of
individual genes to specific human chromosomes
(D'Eustachio et al. (1983) Science 220:919-924). Somatic
cell hybrids containing only fragments of human
chromosomes can also be produced by using human
chromosomes with translocations and deletions.

PCR mapping of somatic cell hybrids is a rapid 
procedure for assigning a particular sequence to a 
particular chromosome. Three or more sequences can be 
assigned per day using a single thermal cycler. Using 
the Tango-77 sequences to design oligonucleotide primers, 
sublocalization can be achieved with panels of fragments 
from specific chromosomes. Other mapping strategies 
which can similarly be used to map a Tango-77 sequence to
its chromosome include in situ hybridization (described
in Fan et al. (1990) Proc. Natl. Acad. Sci. USA 87:6223-27),
pre-screening with labeled flow-sorted chromosomes,
and pre-selection by hybridization to chromosome-specific
cDNA libraries.
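
Assigning a sequence to a chromosome from a hybrid panel amounts to finding the human chromosome whose presence or absence agrees with the PCR result in every cell line. A minimal sketch follows; the panel contents, cell line names, and function name are hypothetical and serve only to illustrate the concordance logic.

```python
def concordant_chromosomes(panel, pcr_positive):
    """Return the human chromosomes whose retention pattern matches the
    PCR results across the whole hybrid panel.

    panel: dict mapping a hybrid cell line name to the set of human
           chromosomes that line retains (hypothetical data).
    pcr_positive: set of cell line names that yielded an amplified fragment.
    """
    all_chromosomes = set().union(*panel.values())
    matches = []
    for chrom in sorted(all_chromosomes):
        # A candidate must be present in every positive line and
        # absent from every negative line.
        if all((chrom in retained) == (name in pcr_positive)
               for name, retained in panel.items()):
            matches.append(chrom)
    return matches
```

If more than one chromosome is returned, the panel does not discriminate between them and additional cell lines would be needed.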

Fluorescence in situ hybridization (FISH) of a DNA
sequence to a metaphase chromosomal spread can further be
used to provide a precise chromosomal location in one
step. Chromosome spreads can be made using cells whose
division has been blocked in metaphase by a chemical,
e.g., colcemid, that disrupts the mitotic spindle. The
chromosomes can be treated briefly with trypsin, and then
stained with Giemsa. A pattern of light and dark bands
develops on each chromosome, so that the chromosomes can
be identified individually. The FISH technique can be
used with a DNA sequence as short as 500 or 600 bases.
However, clones larger than 1,000 bases have a higher
likelihood of binding to a unique chromosomal location
with sufficient signal intensity for simple detection.
Preferably 1,000 bases, and more preferably 2,000 bases,
will suffice to get good results in a reasonable amount
of time. For a review of this technique, see Verma et
al. (Human Chromosomes: A Manual of Basic Techniques
(Pergamon Press, New York, 1988)).

Reagents for chromosome mapping can be used 
individually to mark a single chromosome or a single site 
on that chromosome, or panels of reagents can be used for 
marking multiple sites and/or multiple chromosomes. 
Reagents corresponding to noncoding regions of the genes 
actually are preferred for mapping purposes. Coding 
sequences are more likely to be conserved within gene 
families, thus increasing the chance of cross 
hybridizations during chromosomal mapping. 

Once a sequence has been mapped to a precise
chromosomal location, the physical position of the
sequence on the chromosome can be correlated with genetic
map data. (Such data are found, for example, in V.
McKusick, Mendelian Inheritance in Man, available on-line
through Johns Hopkins University Welch Medical Library.)
The relationship between genes and disease, mapped to the
same chromosomal region, can then be identified through
linkage analysis (co-inheritance of physically adjacent
genes), described in, e.g., Egeland et al. (1987) Nature
325:783-787.

Moreover, differences in the DNA sequences between
individuals affected and unaffected with a disease
associated with the Tango-77 gene can be determined. If
a mutation is observed in some or all of the affected
individuals but not in any unaffected individuals, then
the mutation is likely to be the causative agent of the
particular disease. Comparison of affected and
unaffected individuals generally involves first looking
for structural alterations in the chromosomes such as
deletions or translocations that are visible from
chromosome spreads or detectable using PCR based on that
DNA sequence. Ultimately, complete sequencing of genes
from several individuals can be performed to confirm the
presence of a mutation and to distinguish mutations from
polymorphisms.

2. Tissue Typing

The Tango-77 sequences of the present invention
can also be used to identify individuals from minute
biological samples. The United States military, for
example, is considering the use of restriction fragment
length polymorphism (RFLP) for identification of its
personnel. In this technique, an individual's genomic
DNA is digested with one or more restriction enzymes, and
probed on a Southern blot to yield unique bands for
identification. This method does not suffer from the
current limitations of "Dog Tags" which can be lost,
switched, or stolen, making positive identification
difficult. The sequences of the present invention are
useful as additional DNA markers for RFLP (described in
U.S. Patent 5,272,057).

Furthermore, the sequences of the present
invention can be used to provide an alternative technique
which determines the actual base-by-base DNA sequence of
selected portions of an individual's genome. Thus, the
Tango-77 sequences described herein can be used to
prepare two PCR primers from the 5' and 3' ends of the
sequences. These primers can then be used to amplify an
individual's DNA and subsequently sequence it.

Panels of corresponding DNA sequences from 
individuals, prepared in this manner, can provide unique 
individual identifications, as each individual will have 
a unique set of such DNA sequences due to allelic 
differences. The sequences of the present invention can 
be used to obtain such identification sequences from 
individuals and from tissue. The Tango-77 sequences of 
the invention uniquely represent portions of the human 
genome. Allelic variation occurs to some degree in the 
coding regions of these sequences, and to a greater 
degree in the noncoding regions. It is estimated that 
allelic variation between individual humans occurs with a 
frequency of about once per each 500 bases. Each of the 
sequences described herein can, to some degree, be used 
as a standard against which DNA from an individual can be 
compared for identification purposes. Because greater 
numbers of polymorphisms occur in the noncoding regions, 
fewer sequences are necessary to differentiate 
individuals. The noncoding sequences of SEQ ID NO:1 can
comfortably provide positive individual identification
with a panel of perhaps 10 to 1,000 primers which each
yield a noncoding amplified sequence of 100 bases. If
predicted coding sequences, such as those in SEQ ID NO:3,
SEQ ID NO:6, or SEQ ID NO:10, are used, a more appropriate
number of primers for positive individual identification
would be 500-2,000.
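
As a rough worked example of the figures above, the expected number of variable positions covered by a primer panel can be estimated from the stated average of about one allelic variation per 500 bases. The function name is hypothetical, and applying the overall average uniformly is an assumption; the text states that noncoding regions vary more than coding regions, so this is only an order-of-magnitude illustration.

```python
def expected_variant_sites(n_amplicons, amplicon_len=100, bases_per_variant=500):
    """Expected number of variable positions covered by a panel, assuming
    variation is spread uniformly at one site per bases_per_variant bases."""
    return n_amplicons * amplicon_len / bases_per_variant


# Panel sizes taken from the range given in the text (10 to 1,000 primers,
# each yielding a 100-base amplified sequence).
small_panel = expected_variant_sites(10)     # 2.0 expected variable sites
large_panel = expected_variant_sites(1000)   # 200.0 expected variable sites
```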

If a panel of reagents from Tango-77 sequences 
described herein is used to generate a unique 
identification database for an individual, those same 
reagents can later be used to identify tissue from that 
individual. Using the unique identification database, 
positive identification of the individual, living or 
dead, can be made from extremely small tissue samples.

3. Use of Partial Sequences in Forensic Biology

DNA-based identification techniques can also be
used in forensic biology. Forensic biology is a
scientific field employing genetic typing of biological
evidence found at a crime scene as a means for positively
identifying, for example, a perpetrator of a crime. To
make such an identification, PCR technology can be used
to amplify DNA sequences taken from very small biological
samples such as tissues, e.g., hair or skin, or body
fluids, e.g., blood, saliva, or semen found at a crime
scene. The amplified sequence can then be compared to a
standard, thereby allowing identification of the origin
of the biological sample.

The sequences of the present invention can be used
to provide polynucleotide reagents, e.g., PCR primers,
targeted to specific loci in the human genome, which can
enhance the reliability of DNA-based forensic
identifications by, for example, providing another
"identification marker" (i.e., another DNA sequence that
is unique to a particular individual). As mentioned
above, actual base sequence information can be used for
identification as an accurate alternative to patterns
formed by restriction enzyme generated fragments.
Sequences targeted to noncoding regions of SEQ ID NO:1
are particularly appropriate for this use as greater
numbers of polymorphisms occur in the noncoding regions,
making it easier to differentiate individuals using this
technique. Examples of polynucleotide reagents include
the Tango-77 sequences or portions thereof, e.g.,
fragments derived from the noncoding regions of SEQ ID
NO:1 having a length of at least 20 or 30 bases.

The Tango-77 sequences described herein can
further be used to provide polynucleotide reagents, e.g.,
labeled or labelable probes which can be used in, for
example, an in situ hybridization technique, to identify
a specific tissue, e.g., brain tissue. This can be very
useful in cases where a forensic pathologist is presented
with a tissue of unknown origin. Panels of such Tango-77
probes can be used to identify tissue by species and/or
by organ type.

In a similar fashion, these reagents, e.g.,
Tango-77 primers or probes, can be used to screen tissue
culture for contamination (i.e., screen for the presence
of a mixture of different types of cells in a culture).

C. Predictive Medicine

The present invention also pertains to the field
of predictive medicine in which diagnostic assays,
prognostic assays, pharmacogenomics, and monitoring
clinical trials are used for prognostic (predictive)
purposes to thereby treat an individual prophylactically.
Accordingly, one aspect of the present invention relates 
to diagnostic assays for determining Tango-77 protein 
and/or nucleic acid expression as well as Tango-77 
activity, in the context of a biological sample (e.g., 
blood, serum, cells, tissue) to thereby determine whether 
an individual is afflicted with a disease or disorder, or 
is at risk of developing a disorder, associated with 
aberrant Tango-77 expression or activity. The invention 
also provides for prognostic (or predictive) assays for 
determining whether an individual is at risk of 
developing a disorder associated with Tango-77 protein, 
nucleic acid expression or activity. For example, 
mutations in a Tango-77 gene can be assayed in a 
biological sample. Such assays can be used for 
prognostic or predictive purpose to thereby 
prophylactically treat an individual prior to the onset 
of a disorder characterized by or associated with 
Tango-77 protein, nucleic acid expression or activity.

Another aspect of the invention provides methods
for determining Tango-77 protein, nucleic acid expression
or Tango-77 activity in an individual to thereby select
appropriate therapeutic or prophylactic agents for that
individual (referred to herein as "pharmacogenomics").
Pharmacogenomics allows for the selection of agents
(e.g., drugs) for therapeutic or prophylactic treatment
of an individual based on the genotype of the individual
(e.g., the genotype of the individual examined to
determine the ability of the individual to respond to a
particular agent).

Yet another aspect of the invention pertains to
monitoring the influence of agents (e.g., drugs or other
compounds) on the expression or activity of Tango-77 in
clinical trials.

These and other agents are described in further
detail in the following sections.

1. Diagnostic Assays

An exemplary method for detecting the presence or
absence of Tango-77 in a biological sample involves
obtaining a biological sample from a test subject and
contacting the biological sample with a compound or an
agent capable of detecting Tango-77 protein or nucleic
acid (e.g., mRNA, genomic DNA) that encodes Tango-77
protein such that the presence of Tango-77 is detected in
the biological sample. A preferred agent for detecting
Tango-77 mRNA or genomic DNA is a labeled nucleic acid
probe capable of hybridizing to Tango-77 mRNA or genomic
DNA. The nucleic acid probe can be, for example, a
full-length Tango-77 nucleic acid, such as the nucleic acid of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, or a
portion thereof, such as an oligonucleotide of at least
15, 30, 50, 100, 250 or 500 nucleotides in length and
sufficient to specifically hybridize under stringent
conditions to Tango-77 mRNA or genomic DNA. Other
suitable probes for use in the diagnostic assays of the
invention are described herein.

A preferred agent for detecting Tango-77 protein 
is an antibody capable of binding to Tango-77 protein, 
preferably an antibody with a detectable label. 
Antibodies can be polyclonal, or more preferably, 
monoclonal. An intact antibody, or a fragment thereof 
(e.g., Fab or F(ab')2) can be used. The term "labeled",
with regard to the probe or antibody, is intended to
encompass direct labeling of the probe or antibody by
coupling (i.e., physically linking) a detectable
substance to the probe or antibody, as well as indirect
labeling of the probe or antibody by reactivity with
another reagent that is directly labeled. Examples of
indirect labeling include detection of a primary antibody
using a fluorescently labeled secondary antibody and
end-labeling of a DNA probe with biotin such that it can be
detected with fluorescently labeled streptavidin. The
term "biological sample" is intended to include tissues,
cells and biological fluids isolated from a subject, as
well as tissues, cells and fluids present within a
subject. That is, the detection method of the invention
can be used to detect Tango-77 mRNA, protein, or genomic
DNA in a biological sample in vitro as well as in vivo.
For example, in vitro techniques for detection of
Tango-77 mRNA include Northern hybridizations and in situ
hybridizations. In vitro techniques for detection of
Tango-77 protein include enzyme linked immunosorbent
assays (ELISAs), Western blots, immunoprecipitations and
immunofluorescence. In vitro techniques for detection of
Tango-77 genomic DNA include Southern hybridizations.
Furthermore, in vivo techniques for detection of Tango-77
protein include introducing into a subject a labeled
anti-Tango-77 antibody. For example, the antibody can be
labeled with a radioactive marker whose presence and
location in a subject can be detected by standard imaging
techniques.

In one embodiment, the biological sample contains
protein molecules from the test subject. Alternatively,
the biological sample can contain mRNA molecules from the
test subject or genomic DNA molecules from the test
subject. A preferred biological sample is a peripheral
blood leukocyte sample isolated by conventional means
from a subject.

In another embodiment, the methods further involve 
obtaining a control biological sample from a control 
subject, contacting the control sample with a compound or 
agent capable of detecting Tango-77 protein, mRNA, or 
genomic DNA, such that the presence of Tango-77 protein,
mRNA or genomic DNA is detected in the biological sample, 
and comparing the presence of Tango-77 protein, mRNA or 
genomic DNA in the control sample with the presence of 
Tango-77 protein, mRNA or genomic DNA in the test sample. 

The invention also encompasses kits for detecting 
the presence of Tango-77 in a biological sample (a test
sample). Such kits can be used to determine if a subject
is suffering from or is at increased risk of developing a
disorder associated with aberrant expression of Tango-77
(e.g., an immunological disorder). For example, the kit
can comprise a labeled compound or agent capable of
detecting Tango-77 protein or mRNA in a biological sample
and means for determining the amount of Tango-77 in the
sample (e.g., an anti-Tango-77 antibody or an
oligonucleotide probe which binds to DNA encoding
Tango-77, e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:6, or SEQ ID NO:10). Kits may also include
instructions for observing that the tested subject is
suffering from or is at risk of developing a disorder
associated with aberrant expression of Tango-77 if the
amount of Tango-77 protein or mRNA is above or below a
normal level.

For antibody-based kits, the kit may comprise, for
example: (1) a first antibody (e.g., attached to a solid
support) which binds to Tango-77 protein; and, optionally,
(2) a second, different antibody which binds to Tango-77
protein or the first antibody and is conjugated to a
detectable agent.

For oligonucleotide-based kits, the kit may 
comprise, for example: (1) an oligonucleotide, e.g., a 
detectably labelled oligonucleotide, which hybridizes to 
a Tango-77 nucleic acid sequence or (2) a pair of primers 
useful for amplifying a Tango-77 nucleic acid molecule.

The kit may also comprise, e.g., a buffering 
agent, a preservative, or a protein stabilizing agent. 
The kit may also comprise components necessary for 
detecting the detectable agent (e.g., an enzyme or a 
substrate). The kit may also contain a control sample or
a series of control samples which can be assayed and 
compared to the test sample contained. Each component of 
the kit is usually enclosed within an individual 
container and all of the various containers are within a 
single package along with instructions for observing 
whether the tested subject is suffering from or is at 
risk of developing a disorder associated with aberrant 
expression of Tango-77. 

2. Prognostic Assays

The methods described herein can furthermore be 
utilized as diagnostic or prognostic assays to identify 
subjects having or at risk of developing a disease or 
disorder associated with aberrant Tango-77 expression or 
activity. For example, the assays described herein, such 
as the preceding diagnostic assays or the following 
assays, can be utilized to identify a subject having or
at risk of developing a disorder associated with aberrant
Tango-77 expression or activity. Thus, the present invention
provides a method in which a test sample is obtained from 
a subject and Tango-77 protein or nucleic acid (e.g., 
mRNA, genomic DNA) is detected, wherein the presence of 
Tango-77 protein or nucleic acid is diagnostic for a 
subject having or at risk of developing a disease or 
disorder associated with aberrant Tango-77 expression or 
activity. As used herein, a "test sample" refers to a 
biological sample obtained from a subject of interest. 
For example, a test sample can be a biological fluid 
(e.g., serum), cell sample, or tissue. 

Furthermore, the prognostic assays described 
herein can be used to determine whether a subject can be 
administered an agent (e.g., an agonist, antagonist, 
peptidomimetic, protein, peptide, nucleic acid, small 
molecule, or other drug candidate) to treat a disease or 
disorder associated with aberrant Tango-77 expression or 
activity. For example, such methods can be used to 
determine whether a subject can be effectively treated 
with a specific agent or class of agents (e.g., agents of 
a type which decrease Tango-77 activity). Thus, the
present invention provides methods for determining
whether a subject can be effectively treated with an
agent for a disorder associated with aberrant Tango-77
expression or activity in which a test sample is obtained
and Tango-77 protein or nucleic acid is detected (e.g.,
wherein the presence of Tango-77 protein or nucleic acid
is diagnostic for a subject that can be administered the
agent to treat a disorder associated with aberrant
Tango-77 expression or activity).

The methods of the invention can also be used to 
detect genetic lesions or mutations in a Tango-77 gene, 
thereby determining if a subject with the lesioned gene 
is at risk for a disorder characterized by aberrant
inflammation. In preferred embodiments, the methods
include detecting, in a sample of cells from the subject,
the presence or absence of a genetic lesion or mutation
characterized by at least one of an alteration affecting
the integrity of a gene encoding a Tango-77 protein, or
the mis-expression of the Tango-77 gene. For example,
such genetic lesions or mutations can be detected by
ascertaining the existence of at least one of: 1) a
deletion of one or more nucleotides from a Tango-77 gene;
2) an addition of one or more nucleotides to a Tango-77
gene; 3) a substitution of one or more nucleotides of a
Tango-77 gene; 4) a chromosomal rearrangement of a
Tango-77 gene; 5) an alteration in the level of a
messenger RNA transcript of a Tango-77 gene; 6) an
aberrant modification of a Tango-77 gene, such as of the
methylation pattern of the genomic DNA; 7) the presence
of a non-wild type splicing pattern of a messenger RNA
transcript of a Tango-77 gene; 8) a non-wild type level
of a Tango-77 protein; 9) an allelic loss of a Tango-77
gene; and 10) an inappropriate post-translational
modification of a Tango-77 protein. As described herein,
there are a large number of assay techniques known in the
art which can be used for detecting lesions or mutations
in a Tango-77 gene. A preferred biological sample is a
peripheral blood leukocyte sample isolated by
conventional means from a subject.

In certain embodiments, detection of the lesion 
involves the use of a probe/primer in a polymerase chain 
reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and
4,683,202), such as anchor PCR or RACE PCR, or,
alternatively, in a ligation chain reaction (LCR) (see,
e.g., Landegren et al. (1988) Science 241:1077-1080; and
Nakazawa et al. (1994) Proc. Natl. Acad. Sci. USA 91:360-364),
the latter of which can be particularly useful for
detecting point mutations in the Tango-77 gene (see,
e.g., Abravaya et al. (1995) Nucleic Acids Res. 23:675-682).
This method can include the steps of collecting a
sample of cells from a patient, isolating nucleic acid
(e.g., genomic, mRNA or both) from the cells of the
sample, contacting the nucleic acid sample with one or
more primers which specifically hybridize to a Tango-77
gene under conditions such that hybridization and
amplification of the Tango-77 gene (if present) occurs,
and detecting the presence or absence of an amplification
product, or detecting the size of the amplification
product and comparing the length to a control sample. It
is anticipated that PCR and/or LCR may be desirable to
use as a preliminary amplification step in conjunction
with any of the techniques used for detecting mutations
described herein.

Alternative amplification methods include:
self-sustained sequence replication (Guatelli et al. (1990)
Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional
amplification system (Kwoh et al. (1989) Proc. Natl.
Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi
et al. (1988) Bio/Technology 6:1197), or any other
nucleic acid amplification method, followed by the
detection of the amplified molecules using techniques
well known to those of skill in the art. These detection
schemes are especially useful for the detection of
nucleic acid molecules if such molecules are present in
very low numbers.

In an alternative embodiment, mutations in a 
Tango-77 gene from a sample cell can be identified by 
alterations in restriction enzyme cleavage patterns. For
example, sample and control DNA is isolated, amplified
(optionally), digested with one or more restriction
endonucleases, and fragment length sizes are determined
by gel electrophoresis and compared. Differences in
fragment length sizes between sample and control DNA
indicate mutations in the sample DNA. Moreover,
sequence-specific ribozymes (see, e.g., U.S. Patent
No. 5,498,531) can be used to score for the presence of
specific mutations by development or loss of a ribozyme
cleavage site.
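
The fragment-length comparison can be illustrated with an in silico digest. This is a minimal sketch: the sequences used in the usage note are invented, the function name is hypothetical, and the enzyme is modeled simply as cutting a single strand at a fixed offset within each occurrence of its recognition site (for example, offset 1 for a site such as GAATTC cut after the first base).

```python
def fragment_lengths(sequence, site, cut_offset=1):
    """Return the sorted fragment lengths produced by cutting `sequence`
    at `cut_offset` bases into every occurrence of `site`."""
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Adjacent boundaries delimit one fragment each.
    return sorted(end - start for start, end in zip(bounds, bounds[1:])
                  if end > start)
```

For an invented control sequence carrying two sites and a sample in which one site is destroyed by a single base change, the control yields three fragments and the sample only two, so the two fragment lists differ and a mutation in the sample is indicated.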

In other embodiments, genetic mutations in
Tango-77 can be identified by hybridizing a sample and
control nucleic acids, e.g., DNA or RNA, to high density
arrays containing hundreds or thousands of
oligonucleotide probes (Cronin et al. (1996) Human
Mutation 7:244-255; Kozal et al. (1996) Nature Medicine
2:753-759). For example, genetic mutations in Tango-77
can be identified in two-dimensional arrays containing
light-generated DNA probes as described in Cronin et al.,
supra. Briefly, a first hybridization array of probes
can be used to scan through long stretches of DNA in a
sample and control to identify base changes between the
sequences by making linear arrays of sequential
overlapping probes. This step allows the identification
of point mutations. This step is followed by a second
hybridization array that allows the characterization of
specific mutations by using smaller, specialized probe
arrays complementary to all variants or mutations
detected. Each mutation array is composed of parallel
probe sets, one complementary to the wild-type gene and
the other complementary to the mutant gene.
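
The idea of scanning with sequential overlapping probes can be sketched as follows. This is a simplified model under stated assumptions: hybridization is treated as an exact string match at the same offset, only substitutions (equal-length sequences) are considered, and the function names and probe length are hypothetical.

```python
def tile_probes(reference, probe_len, step=1):
    """Return (start, probe) pairs of sequential overlapping probes
    tiled along the reference (wild-type) sequence."""
    return [(i, reference[i:i + probe_len])
            for i in range(0, len(reference) - probe_len + 1, step)]


def failed_probe_starts(reference, sample, probe_len):
    """Start positions of wild-type probes that lack a perfect match at
    the same offset in the sample (modeled as loss of signal)."""
    return [start for start, probe in tile_probes(reference, probe_len)
            if sample[start:start + probe_len] != probe]
```

A single base substitution causes a run of consecutive probes to fail, namely every probe that overlaps the changed position, which localizes the change to the region those probes share.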

In yet another embodiment, any of a variety of
sequencing reactions known in the art can be used to
directly sequence the Tango-77 gene and detect mutations
by comparing the sequence of the sample Tango-77 with the
corresponding wild-type (control) sequence. Examples of
sequencing reactions include those based on techniques
developed by Maxam and Gilbert ((1977) Proc. Natl. Acad.
Sci. USA 74:560) or Sanger ((1977) Proc. Natl. Acad.
Sci. USA 74:5463). It is also contemplated that any of a
variety of automated sequencing procedures can be
utilized when performing the diagnostic assays ((1995)
Bio/Techniques 19:448), including sequencing by mass
spectrometry (see, e.g., PCT Publication No. WO 94/16101;
Cohen et al. (1996) Adv. Chromatogr. 36:127-162; and
Griffin et al. (1993) Appl. Biochem. Biotechnol. 38:147-159).
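
Once both sequences are in hand, the comparison with the wild-type (control) sequence reduces to a position-by-position check. A minimal sketch follows, assuming the two sequences are already aligned and of equal length, so that only substitutions are reported; the function name and the example sequences are hypothetical.

```python
def point_differences(wild_type, sample):
    """Return (position, wild_type_base, sample_base) for each mismatch,
    with positions numbered from 1. Assumes pre-aligned, equal-length
    sequences; insertions and deletions are out of scope for this sketch."""
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + 1, w, s)
            for i, (w, s) in enumerate(zip(wild_type, sample))
            if w != s]
```

For example, comparing the invented sequences ATGGCC (control) and ATGACC (sample) reports a single G-to-A change at position 4.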

Other methods for detecting mutations in the 
Tango-77 gene include methods in which protection from 
cleavage agents is used to detect mismatched bases in 
RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985)
Science 230:1242). In general, the technique of
"mismatch cleavage" entails providing heteroduplexes
formed by hybridizing (labeled) RNA or DNA containing the
wild-type Tango-77 sequence with potentially mutant RNA
or DNA obtained from a tissue sample. The
double-stranded duplexes are treated with an agent which cleaves
single-stranded regions of the duplex, such as those which
exist due to basepair mismatches between the control and
sample strands. RNA/DNA duplexes can be treated with
RNase to digest mismatched regions, and DNA/DNA hybrids
can be treated with S1 nuclease to digest mismatched
regions. In other embodiments, either DNA/DNA or RNA/DNA
duplexes can be treated with hydroxylamine or osmium
tetroxide and with piperidine in order to digest
mismatched regions. After digestion of the mismatched
regions, the resulting material is then separated by size
on denaturing polyacrylamide gels to determine the site
of mutation. See, e.g., Cotton et al. (1988) Proc. Natl.
Acad. Sci. USA 85:4397; Saleeba et al. (1992) Methods
Enzymol. 217:286-295. In a preferred embodiment, the
control DNA or RNA can be labeled for detection. 

In still another embodiment, the mismatch cleavage
reaction employs one or more proteins that recognize
mismatched base pairs in double-stranded DNA (so called
"DNA mismatch repair" enzymes) in defined systems for
detecting and mapping point mutations in Tango-77 cDNAs
obtained from samples of cells. For example, the mutY
enzyme of E. coli cleaves A at G/A mismatches and the
thymidine DNA glycosylase from HeLa cells cleaves T at
G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662).
According to an exemplary embodiment, a probe
based on a Tango-77 sequence, e.g., a wild-type Tango-77
sequence, is hybridized to a cDNA or other DNA product
from a test cell(s). The duplex is treated with a DNA
mismatch repair enzyme, and the cleavage products, if
any, can be detected from electrophoresis protocols or
the like. See, e.g., U.S. Patent No. 5,459,039.

In other embodiments, alterations in
electrophoretic mobility will be used to identify
mutations in Tango-77 genes. For example, single strand
conformation polymorphism (SSCP) may be used to detect
differences in electrophoretic mobility between mutant
and wild type nucleic acids (Orita et al. (1989) Proc.
Natl. Acad. Sci. USA 86:2766; see also Cotton (1993)
Mutat. Res. 285:125-144; Hayashi (1992) Genet Anal Tech
Appl 9:73-79). Single-stranded DNA fragments of sample
and control Tango-77 nucleic acids will be denatured and
allowed to renature. The secondary structure of
single-stranded nucleic acids varies according to sequence, and
the resulting alteration in electrophoretic mobility
enables the detection of even a single base change. The
DNA fragments may be labeled or detected with labeled
probes. The sensitivity of the assay may be enhanced by
using RNA (rather than DNA), in which the secondary
structure is more sensitive to a change in sequence. In
a preferred embodiment, the subject method utilizes
heteroduplex analysis to separate double stranded
heteroduplex molecules on the basis of changes in
electrophoretic mobility (Keen et al. (1991) Trends Genet
7:5).

In yet another embodiment, the movement of mutant
or wild-type fragments in polyacrylamide gels containing
a gradient of denaturant is assayed using denaturing
gradient gel electrophoresis (DGGE) (Myers et al. (1985)
Nature 313:495). When DGGE is used as the method of
analysis, DNA will be modified to ensure that it does not
completely denature, for example by adding a GC clamp of
approximately 40 bp of high-melting GC-rich DNA by PCR.
In a further embodiment, a temperature gradient is used
in place of a denaturing gradient to identify differences
in the mobility of control and sample DNA (Rosenbaum and
Reissner (1987) Biophys. Chem. 265:12753).

Examples of other techniques for detecting point
mutations include, but are not limited to, selective
oligonucleotide hybridization, selective amplification,
or selective primer extension. For example,
oligonucleotide primers may be prepared in which the
known mutation is placed centrally and then hybridized to
target DNA under conditions which permit hybridization
only if a perfect match is found (Saiki et al. (1986)
Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad.
Sci. USA 86:6230). Such allele specific oligonucleotides
are hybridized to PCR amplified target DNA or a number of
different mutations when the oligonucleotides are
attached to the hybridizing membrane and hybridized with
labeled target DNA.
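
The logic of allele specific oligonucleotide hybridization can be
sketched as follows. This is a minimal illustration that models
stringent hybridization as an exact substring match and writes probe
and target in the same sense; real probes are complementary to the
target and hybridization is governed by temperature and salt
conditions. All sequences and names are hypothetical.

```python
def hybridizes(probe: str, target: str) -> bool:
    """Model stringent hybridization: signal only on a perfect match."""
    return probe in target


def call_genotype(alleles: list[str], wt_probe: str, mut_probe: str) -> str:
    """Call a genotype from the two probe signals across a sample's alleles."""
    wt_signal = any(hybridizes(wt_probe, a) for a in alleles)
    mut_signal = any(hybridizes(mut_probe, a) for a in alleles)
    if wt_signal and mut_signal:
        return "heterozygous"
    if wt_signal:
        return "homozygous wild-type"
    if mut_signal:
        return "homozygous mutant"
    return "no call"


# Hypothetical amplified target alleles; the variant base sits at the
# center (position 6 of 11) of each probe.
WT_ALLELE = "ATGGCCTTAGGACCTGA"
MUT_ALLELE = "ATGGCCTTAGCACCTGA"
WT_PROBE = "CTTAGGACCTG"
MUT_PROBE = "CTTAGCACCTG"

result = call_genotype([WT_ALLELE, MUT_ALLELE], WT_PROBE, MUT_PROBE)
```

Placing the variant base centrally means a single mismatch disrupts the
whole probe, which is why the sketch treats any mismatch as loss of
signal.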

Alternatively, allele specific amplification
technology which depends on selective PCR amplification
may be used in conjunction with the instant invention.
Oligonucleotides used as primers for specific
amplification may carry the mutation of interest in the
center of the molecule (so that amplification depends on
differential hybridization) (Gibbs et al. (1989) Nucleic
Acids Res. 17:2437-2448) or at the extreme 3' end of one
primer where, under appropriate conditions, mismatch can 
prevent or reduce polymerase extension (Prossner (1993) 
Tibtech 11:238). In addition, it may be desirable to 
introduce a novel restriction site in the region of the 
mutation to create cleavage-based detection (Gasparini et 
al. (1992) Mol. Cell Probes 6:1). It is anticipated that 
in certain embodiments amplification may also be 
performed using Taq ligase for amplification (Barany 
(1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, 
ligation will occur only if there is a perfect match at 
the 3' end of the 5' sequence making it possible to 
detect the presence of a known mutation at a specific 
site by looking for the presence or absence of 
amplification. 

The methods described herein may be performed, for 
example, by utilizing pre-packaged diagnostic kits 
comprising at least one probe nucleic acid or antibody 
reagent described herein, which may be conveniently used, 
e.g., in clinical settings to diagnose patients 
exhibiting symptoms or family history of a disease or 
illness involving a Tango-77 gene.

Furthermore, any cell type or tissue, preferably 
peripheral blood leukocytes, in which Tango-77 is
expressed may be utilized in the prognostic assays 
described herein. 

3. Pharmacogenomics

Agents, or modulators which have a stimulatory or 
inhibitory effect on Tango-77 activity (e.g., Tango-77 
gene expression) as identified by a screening assay 
described herein can be administered to individuals to 
treat (prophylactically or therapeutically) disorders 
(e.g., acute or chronic inflammation and asthma) 
associated with aberrant Tango-77 activity. In
conjunction with such treatment, the pharmacogenomics
(i.e., the study of the relationship between an
individual's genotype and that individual's response to a
foreign compound or drug) of the individual may be
considered. Differences in metabolism of therapeutics
can lead to severe toxicity or therapeutic failure by
altering the relation between dose and blood
concentration of the pharmacologically active drug. Thus,
the pharmacogenomics of the individual permits the
selection of effective agents (e.g., drugs) for
prophylactic or therapeutic treatments based on a
consideration of the individual's genotype. Such
pharmacogenomics can further be used to determine
appropriate dosages and therapeutic regimens.
Accordingly, the activity of Tango-77 protein, expression
of Tango-77 nucleic acid, or mutation content of Tango-77
genes in an individual can be determined to thereby
select appropriate agent(s) for therapeutic or
prophylactic treatment of the individual.

Pharmacogenomics deals with clinically significant
hereditary variations in the response to drugs due to
altered drug disposition and abnormal action in affected
persons. See, e.g., Linder (1997) Clin. Chem.
43(2):254-266. In general, two types of pharmacogenetic
conditions can be differentiated. Genetic conditions
transmitted as a single factor altering the way drugs act
on the body are referred to as "altered drug action."
Genetic conditions transmitted as single factors altering
the way the body acts on drugs are referred to as
"altered drug metabolism." These pharmacogenetic
conditions can occur either as rare defects or as
polymorphisms. For example, glucose-6-phosphate
dehydrogenase deficiency (G6PD) is a common inherited
enzymopathy in which the main clinical complication is
haemolysis after ingestion of oxidant drugs
(anti-malarials, sulfonamides, analgesics, nitrofurans) and
consumption of fava beans.

As an illustrative embodiment, the activity of
drug metabolizing enzymes is a major determinant of both
the intensity and duration of drug action. The discovery
of genetic polymorphisms of drug metabolizing enzymes
(e.g., N-acetyltransferase 2 (NAT 2) and cytochrome P450
enzymes CYP2D6 and CYP2C19) has provided an explanation
as to why some patients do not obtain the expected drug
effects or show exaggerated drug response and serious
toxicity after taking the standard and safe dose of a
drug. These polymorphisms are expressed in two
phenotypes in the population, the extensive metabolizer
(EM) and poor metabolizer (PM). The prevalence of PM is
different among different populations. For example, the
gene coding for CYP2D6 is highly polymorphic and several
mutations have been identified in PM, which all lead to
the absence of functional CYP2D6. Poor metabolizers of
CYP2D6 and CYP2C19 quite frequently experience
exaggerated drug response and side effects when they
receive standard doses. If a metabolite is the active
therapeutic moiety, PM shows no therapeutic response, as
demonstrated for the analgesic effect of codeine mediated
by its CYP2D6-formed metabolite morphine. The other
extreme are the so called ultra-rapid metabolizers who do
not respond to standard doses. Recently, the molecular
basis of ultra-rapid metabolism has been identified to be
due to CYP2D6 gene amplification.
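
The relationship between metabolizer phenotype and the outcome of a
standard dose can be summarized in a small sketch. The copy-number
thresholds used here (0 functional copies for PM, 1 to 2 for EM, more
than 2 for ultra-rapid) are illustrative assumptions chosen to mirror
"absence of functional CYP2D6" and "gene amplification"; they are not
clinical cut-offs, and the function names are hypothetical.

```python
from enum import Enum


class Phenotype(Enum):
    POOR = "PM"
    EXTENSIVE = "EM"
    ULTRA_RAPID = "ultra-rapid"


def classify(functional_copies: int) -> Phenotype:
    """Map an assumed count of functional gene copies to a phenotype."""
    if functional_copies < 0:
        raise ValueError("copy number cannot be negative")
    if functional_copies == 0:
        return Phenotype.POOR
    if functional_copies <= 2:
        return Phenotype.EXTENSIVE
    return Phenotype.ULTRA_RAPID


def standard_dose_outcome(phenotype: Phenotype, metabolite_is_active: bool) -> str:
    """Outcome of a standard dose, limited to the cases the text describes."""
    if phenotype is Phenotype.POOR:
        if metabolite_is_active:
            return "no therapeutic response"
        return "exaggerated response and side effects"
    if phenotype is Phenotype.ULTRA_RAPID and not metabolite_is_active:
        return "no response at standard dose"
    if phenotype is Phenotype.EXTENSIVE:
        return "expected response"
    return "not covered by the text"


# Codeine-like case: the metabolite is the active moiety.
outcome = standard_dose_outcome(classify(0), metabolite_is_active=True)
```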

Thus, the activity of Tango-77 protein, expression
of Tango-77 nucleic acid, or mutation content of Tango-77
genes in an individual can be determined to thereby
select appropriate agent(s) for therapeutic or
prophylactic treatment of the individual. In addition,
pharmacogenetic studies can be used to apply genotyping
of polymorphic alleles encoding drug-metabolizing enzymes
to the identification of an individual's drug
responsiveness phenotype. This knowledge, when applied
to dosing or drug selection, can avoid adverse reactions
or therapeutic failure and thus enhance therapeutic or
prophylactic efficiency when treating a subject with a
Tango-77 modulator, such as a modulator identified by one
of the exemplary screening assays described herein.

4. Monitoring of Effects During Clinical Trials

Monitoring the influence of agents (e.g., drugs,
compounds) on the expression or activity of Tango-77
(e.g., the ability to modulate aberrant inflammation) can
be applied not only in basic drug screening, but also in
clinical trials. For example, the effectiveness of an
agent, as determined by a screening assay as described
herein, to increase Tango-77 gene expression, increase
protein levels, or upregulate Tango-77 activity, can be
monitored in clinical trials of subjects exhibiting
decreased Tango-77 gene expression, decreased protein
levels, or downregulated Tango-77 activity.
Alternatively, the effectiveness of an agent, as
determined by a screening assay, to decrease Tango-77
gene expression, decrease protein levels, or downregulate
Tango-77 activity, can be monitored in clinical trials of
subjects exhibiting increased Tango-77 gene expression,
increased protein levels, or upregulated Tango-77
activity.

For example, and not by way of limitation, genes,
including Tango-77, that are modulated in cells by
treatment with an agent (e.g., compound, drug or small
molecule) which modulates Tango-77 activity (e.g., as
identified in a screening assay described herein) can be
identified. Thus, to study the effect of agents on
cellular proliferation disorders, for example, in a
clinical trial, cells can be isolated and RNA prepared
and analyzed for the levels of expression of Tango-77 and
other genes implicated in the disorder. The levels of
gene expression (i.e., a gene expression pattern) can be
quantified by Northern blot analysis or RT-PCR, as
described herein, or alternatively by measuring the
amount of protein produced, by one of the methods as
described herein, or by measuring the levels of activity
of Tango-77 or other genes. In this way, the gene
expression pattern can serve as a marker, indicative of
the physiological response of the cells to the agent.
Accordingly, this response state may be determined
before, and at various points during, treatment of the
individual with the agent.

In a preferred embodiment, the present invention
provides a method for monitoring the effectiveness of
treatment of a subject with an agent (e.g., an agonist,
antagonist, peptidomimetic, protein, peptide, nucleic
acid, small molecule, or other drug candidate identified
by the screening assays described herein) comprising the
steps of (i) obtaining a pre-administration sample from a
subject prior to administration of the agent; (ii)
detecting the level of expression of a Tango-77 protein,
mRNA, or genomic DNA in the pre-administration sample;
(iii) obtaining one or more post-administration samples
from the subject; (iv) detecting the level of expression
or activity of the Tango-77 protein, mRNA, or genomic DNA
in the post-administration samples; (v) comparing the
level of expression or activity of the Tango-77 protein,
mRNA, or genomic DNA in the pre-administration sample
with the Tango-77 protein, mRNA, or genomic DNA in the
post-administration sample or samples; and (vi) altering
the administration of the agent to the subject
accordingly. For example, increased administration of
the agent may be desirable to increase the expression or
activity of Tango-77 to higher levels than detected,
i.e., to increase the effectiveness of the agent.
Alternatively, decreased administration of the agent may
be desirable to decrease expression or activity of
Tango-77 to lower levels than detected, i.e., to decrease
the effectiveness of the agent.
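
Steps (v) and (vi) of the monitoring method can be sketched as a simple
comparison. The sketch assumes an agent that raises Tango-77 expression
or activity, a single numeric level per sample, and a hypothetical
target range supplied by the investigator; none of these specifics come
from the text, and the function name is illustrative only.

```python
from statistics import mean


def monitor(pre: float, post: list[float],
            target: tuple[float, float]) -> tuple[float, str]:
    """Compare pre- and post-administration levels and suggest step (vi).

    Assumes the agent raises Tango-77 expression or activity and that
    `target` is a (low, high) range in the same units as the samples.
    """
    if pre <= 0 or not post:
        raise ValueError("need a positive baseline and at least one post sample")
    low, high = target
    post_mean = mean(post)
    fold_change = post_mean / pre
    if post_mean < low:
        action = "increase administration"
    elif post_mean > high:
        action = "decrease administration"
    else:
        action = "maintain administration"
    return fold_change, action


# Hypothetical measurements in arbitrary units.
fold, action = monitor(pre=2.0, post=[4.0, 6.0], target=(3.0, 4.5))
```

For an agent intended to lower Tango-77 expression or activity, the two
dosing branches would be reversed.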

C. Methods of Treatment 
The present invention provides for both
prophylactic and therapeutic methods of treating a
subject at risk of (or susceptible to) developing or
having a disorder associated with aberrant Tango-77
expression or activity. Alternatively, disorders
associated with aberrant IL-1 production can be treated
with Tango-77. Such disorders include acute and chronic
inflammation, asthma, some classes of arthritis,
autoimmune diabetes, systemic lupus erythematosus and
inflammatory bowel disease.

1. Prophylactic Methods

In one aspect, the invention provides a method for
preventing in a subject, a disease or condition
associated with an aberrant Tango-77 expression or
activity (or aberrant IL-1 expression or activity), by
administering to the subject an agent which modulates
Tango-77 expression or at least one Tango-77 activity.
Subjects at risk for a disease which is caused or
contributed to by aberrant Tango-77 expression or
activity can be identified by, for example, any or a
combination of diagnostic or prognostic assays as
described herein. Administration of a prophylactic agent
can occur prior to the manifestation of symptoms
characteristic of the Tango-77 aberrancy, such that a
disease or disorder is prevented or, alternatively,
delayed in its progression. Depending on the type of
Tango-77 aberrancy, for example, a Tango-77 agonist or
Tango-77 antagonist agent can be used for treating the
subject. The appropriate agent can be determined based
on screening assays described herein.



2. Therapeutic Methods

Another aspect of the invention pertains to
methods of modulating Tango-77 expression or activity for
therapeutic purposes. The modulatory method of the
invention involves contacting a cell with an agent that
modulates one or more of the activities of Tango-77
protein associated with the cell. An agent that
modulates Tango-77 protein activity can be an agent as
described herein, such as a nucleic acid or a protein, a
naturally-occurring cognate ligand of a Tango-77 protein,
a peptide, a Tango-77 peptidomimetic, or other small
molecule. In one embodiment, the agent stimulates one or
more of the biological activities of Tango-77 protein.
Examples of such stimulatory agents include active
Tango-77 protein and a nucleic acid molecule encoding
Tango-77 that has been introduced into the cell. In
another embodiment, the agent inhibits one or more of the
biological activities of Tango-77 protein. Examples of
such inhibitory agents include antisense Tango-77 nucleic
acid molecules and anti-Tango-77 antibodies. These
modulatory methods can be performed in vitro (e.g., by
culturing the cell with the agent) or, alternatively, in
vivo (e.g., by administering the agent to a subject). As
such, the present invention provides methods of treating
an individual afflicted with a disease or disorder
characterized by aberrant expression or activity of a
Tango-77 protein or nucleic acid molecule. In one
embodiment, the method involves administering an agent
(e.g., an agent identified by a screening assay described
herein), or combination of agents that modulates (e.g.,
upregulates or downregulates) Tango-77 expression or
activity. In another embodiment, the method involves
administering a Tango-77 protein or nucleic acid molecule
as therapy to compensate for reduced or aberrant Tango-77
expression or activity.

Stimulation of Tango-77 activity is desirable in
situations in which Tango-77 is abnormally downregulated 
and/or in which increased Tango-77 activity is likely to 
have a beneficial effect. Conversely, inhibition of 
Tango-77 activity is desirable in situations in which 
Tango-77 is abnormally upregulated and/or in which 
decreased Tango-77 activity is likely to have a 
beneficial effect. 

This invention is further illustrated by the 
following examples which should not be construed as 
limiting. The contents of all references, patents and 
published patent applications cited throughout this 
application are hereby incorporated by reference. 



EXAMPLES 

Example 1: Isolation and Characterization of Human
Tango-77 cDNAs

Cytokine genes IL-1α, IL-1β and IL-1ra have been
found to be closely clustered on chromosome 2, i.e.,
IL-1α, IL-1β and IL-1ra are located within 450 kb of each
other. BAC clones containing IL-1α and IL-1β were used
to identify other proximal unknown cytokine genes. To do
this, a BAC clone containing IL-1α and IL-1β was selected
from a BAC library (Research Genetics, Huntsville,
Alabama) using specific primers designed against IL-1α
and IL-1β. The DNA from the BAC was extracted and used
to make a random-sheared genomic library. From this BAC
library, 4000 clones were selected for sequencing. The
resulting genomic sequences were then assembled into
contigs and used to screen proprietary and public
databases. One genomic contig was found to contain two
segments of sequences which resemble IL-1ra. These two
segments are potential exons of the Tango-77 gene.

Two PCR primers were then designed from the two
potential exons and used to screen a panel of cDNA
libraries for the expression of a Tango-77 message. A
cDNA library from TNF-α treated human lung epithelia
showed a positive band of the predicted size (i.e., if
the two exons are spliced together). Using the PCR
fragment as a probe, a single cDNA clone was isolated
from the same library. This cDNA contains an insert of
989 bp. The cDNA clone contains three possible open
reading frames. The first open reading frame encompasses
534 nucleotides (nucleotides 356-889 of SEQ ID NO:1; SEQ
ID NO:3) and encodes a 178 amino acid protein (SEQ ID
NO:2). This protein may include a predicted signal
sequence of about 63 amino acids (from amino acid 1 to
about amino acid 63 of SEQ ID NO:2 (SEQ ID NO:4)) and a
predicted mature protein of about 115 amino acids (from
about amino acid 64 to amino acid 178 of SEQ ID NO:2 (SEQ
ID NO:5)).

The second putative nucleotide open reading frame
encompasses 498 nucleotides (nucleotides 389-889 of SEQ
ID NO:1; SEQ ID NO:6) and encodes a 167 amino acid
protein (SEQ ID NO:7). This protein includes a predicted
signal sequence of about 52 amino acids (from amino acid
1 to about amino acid 52 of SEQ ID NO:7 (SEQ ID NO:8))
and a predicted mature protein of about 115 amino acids
(from about amino acid 53 to amino acid 167 of SEQ ID
NO:7 (SEQ ID NO:9)).

The third open reading frame (nucleotides 372-889
of SEQ ID NO:1; SEQ ID NO:10) encompasses 408 nucleotides
and encodes a 136 amino acid protein (SEQ ID NO:11).
This protein includes a predicted signal sequence of
about 21 amino acids (from amino acid 1 to about amino
acid 21 of SEQ ID NO:11 (SEQ ID NO:12)) and a predicted
mature protein of about 115 amino acids (from about amino
acid 22 to amino acid 136 of SEQ ID NO:11 (SEQ ID
NO:13)).
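
The figures quoted for the three reading frames can be cross-checked
with simple arithmetic. The sketch below assumes 1-based inclusive
nucleotide coordinates and that the stop codon is not counted in the
stated length, as is the case for the first frame (889 - 356 + 1 = 534
= 3 x 178). The values are entered as printed above; a False flag marks
a figure worth re-checking against the sequence listing and is not a
statement about the sequence itself. Class and field names are
illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    name: str
    start: int       # 1-based, inclusive, in SEQ ID NO:1
    end: int
    stated_nt: int   # length in nucleotides as stated in the text
    residues: int    # encoded protein length
    signal: int      # predicted signal sequence length
    mature: int      # predicted mature protein length


def span(orf: Orf) -> int:
    """Number of nucleotides covered by the stated coordinates."""
    return orf.end - orf.start + 1


def check(orf: Orf) -> dict[str, bool]:
    """Compare the stated length with the coordinates and residue count."""
    return {
        "span_matches_stated": span(orf) == orf.stated_nt,
        "codons_match_stated": 3 * orf.residues == orf.stated_nt,
        "signal_plus_mature": orf.signal + orf.mature == orf.residues,
    }


ORFS = [
    Orf("first", 356, 889, 534, 178, 63, 115),
    Orf("second", 389, 889, 498, 167, 52, 115),
    Orf("third", 372, 889, 408, 136, 21, 115),
]

report = {orf.name: check(orf) for orf in ORFS}
```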

Tango-77 is predicted to be 35% identical to human
IL-1ra at the amino acid level.

Example 2: Expression of Tango-77 mRNA in Human Tissues

The expression of Tango-77 was analyzed using
Northern blot hybridization. A PCR generated 989 bp
Tango-77 product was radioactively labeled with 32P-dCTP
using the Prime-It kit (Stratagene; La Jolla, CA)
according to the instructions of the supplier. Filters
containing human mRNA (MTNI and MTNII: Clontech; Palo
Alto, CA) were probed in ExpressHyb hybridization
solution (Clontech) and washed at high stringency
according to manufacturer's recommendations.

Tango-77 mRNA was not detected on Clontech Northern
blots in mRNA from any unstimulated tissues (brain,
liver, spleen, skeletal muscle, testis, pancreas, heart,
kidney and peripheral blood leukocytes).

Over 96 cDNA libraries were then tested for the
presence of Tango-77 using PCR amplification. Only three
libraries displayed a positive signal. These libraries
were the TNFα-treated bronchoepithelium, TNFα-treated SSC
cell line and anti-CD3-treated T cells.

Example 3: Characterization of Tango-77 Proteins

In this example, the predicted amino acid sequence
of human Tango-77 protein was compared to the amino acid
sequence of known protein IL-1ra. In addition, the
molecular weight of the human Tango-77 proteins was
predicted.

The human Tango-77 cDNA (Figure 1; SEQ ID NO:1)
isolated as described above encodes a 178 amino acid
protein (Figure 1; SEQ ID NO:2) or a 167 amino acid
protein (Figure 1; SEQ ID NO:7) or a 136 amino acid
protein (Figure 1; SEQ ID NO:11). The signal peptide
prediction program SIGNALP Optimized Tool (Nielsen et al.
(1997) Protein Engineering 10:1-6) predicted that
Tango-77 includes a 63 amino acid signal peptide (amino
acid 1 to about amino acid 63 of SEQ ID NO:2 (SEQ ID
NO:4)) preceding the 115 amino acid mature protein; or a
52 amino acid signal peptide (amino acid 1 to about amino
acid 52 of SEQ ID NO:7 (SEQ ID NO:8)) preceding the 115
amino acid mature protein; or a 21 amino acid signal
peptide (amino acid 1 to about amino acid 21 of SEQ ID
NO:11 (SEQ ID NO:12)) preceding the 115 amino acid mature
protein.

As shown in Figure 2, Tango-77 has a region of
homology to IL-1ra (SEQ ID NO:14).

Mature Tango-77 has a predicted MW of about 13 kDa
and the predicted MW for the immature Tango-77 is 19.6
kDa, 18.5 kDa or 15.2 kDa, not including
post-translational modifications.
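
The text does not state how these molecular weights were computed. As a
rough plausibility check only, the sketch below applies a rule-of-thumb
average of about 110 Da per amino acid residue, which is an assumption
and not the method used in this example; a sequence-based calculation
would sum the actual residue masses. All names are illustrative.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average, not from the text


def approx_mw_kda(residues: int) -> float:
    """Approximate unmodified protein mass in kDa from residue count."""
    return residues * AVG_RESIDUE_MASS_DA / 1000.0


# Residue counts and the weights stated in the text (kDa).
STATED_KDA = {178: 19.6, 167: 18.5, 136: 15.2, 115: 13.0}

estimates = {n: approx_mw_kda(n) for n in STATED_KDA}
relative_error = {n: abs(estimates[n] - s) / s for n, s in STATED_KDA.items()}
```

Under this assumption each estimate falls within a few percent of the
stated value, which is consistent with the stated weights being derived
from the 178, 167, 136 and 115 residue forms.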

Example 4: Preparation of Tango-77 Proteins 

Recombinant Tango-77 can be produced in a variety
of expression systems. For example, the mature Tango-77
peptide can be expressed as a recombinant
glutathione-S-transferase (GST) fusion protein in E. coli and the
fusion protein can be isolated and characterized.
Specifically, as described above, Tango-77 can be fused
to GST and this fusion protein can be expressed in E.
coli strain PEB199. Expression of the GST-Tango-77
fusion protein in PEB199 can be induced with IPTG. The
recombinant fusion protein can be purified from crude
bacterial lysates of the induced PEB199 strain by
affinity chromatography on glutathione beads.

Example 5: Alternatively Spliced Forms of IL-1ra and
Tango-77

Computer program Procrustes (Gelfand et al., 1996,
Proc. Natl. Acad. Sci. USA, 93:9061-9066) is an alignment
algorithm that predicts the presence of
alternatively-spliced exons for a protein of interest in a stretch of
genomic DNA. Using the IL-1ra sequence, Procrustes was
used to search for the presence of additional sequences
that might encode alternatively spliced forms of
IL-1ra in the two overlapping BAC genomic sequences (see
Fig. 3 and Fig. 4). Potential sequences that encode
variant exons for IL-1ra were identified. These
predicted exons aligned well with the N-terminal region
of IL-1ra, but were not present in Tango-77. The results
from Procrustes predict the existence of more spliced
forms of IL-1ra.

Furthermore, Procrustes also predicted an
additional sequence in BAC1 and BAC2 that encodes an
alternatively spliced exon for Tango-77 (T77-procrustes;
Fig. 5). This predicted splice variant form of Tango-77,
T77-procrustes, was aligned with Tango-77 (Fig. 6) and
with IL-1ra and IL-1β (Fig. 7).

PCR primers within this sequence can be used to
generate a product that can be used to screen a panel of
cDNA libraries using standard techniques. Suitable cDNA
libraries include libraries made from TNFα-treated
bronchoepithelium, TNFα-treated SSC cell line and
anti-CD3-treated T cells. The resulting cDNA clone(s) can be
isolated from the library and sequenced to identify
additional Tango-77 cDNAs.

Equivalents

Those skilled in the art will recognize, or be 
able to ascertain using no more than routine 
experimentation, many equivalents to the specific 
embodiments of the invention described herein. Such
equivalents are intended to be encompassed by the 
following claims. 




What is claimed is: 

1. An isolated nucleic acid molecule selected
from the group consisting of:

a) a nucleic acid molecule comprising a
nucleotide sequence which is at least 45% identical to
the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ
ID NO:6, SEQ ID NO:10, the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807, or a
complement thereof;

b) a nucleic acid molecule comprising a fragment
of at least 300 nucleotides of the nucleotide sequence of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:10, the
cDNA insert of the plasmid deposited with ATCC as
Accession Number 98807, or a complement thereof;

c) a nucleic acid molecule which encodes a
polypeptide comprising the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or
an amino acid sequence encoded by the cDNA insert of the
plasmid deposited with ATCC as Accession Number 98807;

d) a nucleic acid molecule which encodes a
fragment of a polypeptide comprising the amino acid
sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID
NO:12, SEQ ID NO:13, wherein the fragment comprises at
least 15 contiguous amino acids of SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or the
polypeptide encoded by the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807; and

e) a nucleic acid molecule which encodes a
naturally occurring allelic variant of a polypeptide
comprising the amino acid sequence of SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or an amino
acid sequence encoded by the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807, wherein
the nucleic acid molecule hybridizes to a nucleic acid
molecule comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:6, SEQ ID NO:10, or the complement thereof under
stringent conditions.

2. The isolated nucleic acid molecule of claim
1, which is selected from the group consisting of:

a) a nucleic acid comprising the nucleotide
sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, or SEQ
ID NO:10 or the cDNA insert of the plasmid deposited with
ATCC as Accession Number 98807, or a complement thereof;
and

b) a nucleic acid molecule which encodes a
polypeptide comprising the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or
an amino acid sequence encoded by the cDNA insert of the
plasmid deposited with ATCC as Accession Number 98807.

3. The nucleic acid molecule of claim 1 further
comprising vector nucleic acid sequences.

4. The nucleic acid molecule of claim 1 further
comprising nucleic acid sequences encoding a heterologous
polypeptide.

5. A host cell containing the nucleic acid
molecule of claim 1.

6. The host cell of claim 5 which is a mammalian
host cell.

7. A non-human mammalian host cell containing
the nucleic acid molecule of claim 1.

8. An isolated polypeptide selected from the
group consisting of:

a) a fragment of a polypeptide comprising the
amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:12, SEQ ID NO:13, wherein the fragment
comprises at least 15 contiguous amino acids of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, or SEQ ID NO:13;

b) a naturally occurring allelic variant of a
polypeptide comprising the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or
an amino acid sequence encoded by the cDNA insert of the
plasmid deposited with ATCC as Accession Number 98807,
wherein the polypeptide is encoded by a nucleic acid
molecule which hybridizes to a nucleic acid molecule
comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID
NO:10 or the complement thereof under stringent
conditions;

c) a polypeptide which is encoded by a nucleic
acid molecule comprising a nucleotide sequence which is
at least 55% identical to a nucleic acid comprising the
nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:6, or SEQ ID NO:10.

9. The isolated polypeptide of claim 8
comprising the amino acid sequence of SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or an amino
acid sequence encoded by the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807.

10. The polypeptide of claim 8 further comprising
heterologous amino acid sequences.

11. An antibody which selectively binds to a
polypeptide of claim 8.

12. A method for producing a polypeptide selected
from the group consisting of:

a) a polypeptide comprising the amino acid
sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID
NO:12, SEQ ID NO:13, or an amino acid sequence encoded by
the cDNA insert of the plasmid deposited with ATCC as
Accession Number 98807;

b) a fragment of a polypeptide comprising the
amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:12, SEQ ID NO:13, or an amino acid
sequence encoded by the cDNA insert of the plasmid
deposited with ATCC as Accession Number 98807, wherein
the fragment comprises at least 15 contiguous amino acids
of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ
ID NO:13, or an amino acid sequence encoded by the cDNA
insert of the plasmid deposited with ATCC as Accession
Number 98807; and

c) a naturally occurring allelic variant of a
polypeptide comprising the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or
an amino acid sequence encoded by the cDNA insert of the
plasmid deposited with ATCC as Accession Number 98807,
wherein the polypeptide is encoded by a nucleic acid
molecule which hybridizes to a nucleic acid sequence of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, or SEQ ID NO:10
under stringent conditions;

comprising culturing the host cell of claim 5
under conditions in which the nucleic acid molecule is
expressed.
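
As an illustration only of the fragment criterion recited in claim 12(b), the following minimal Python sketch tests whether a candidate fragment contains at least 15 contiguous residues of a reference sequence. The function name, the placeholder reference string, and the example fragments are assumptions introduced for this sketch; none of them is taken from the sequence listing.

```python
def shares_contiguous_run(fragment: str, reference: str, min_len: int = 15) -> bool:
    """Return True if `fragment` contains at least `min_len` contiguous
    residues that also occur contiguously in `reference`."""
    fragment = fragment.upper()
    reference = reference.upper()
    if len(fragment) < min_len:
        return False
    # A shared run longer than min_len always contains a shared window of
    # exactly min_len residues, so checking fixed-size windows is sufficient.
    return any(
        fragment[i:i + min_len] in reference
        for i in range(len(fragment) - min_len + 1)
    )


# Hypothetical placeholder, NOT one of the SEQ ID NO sequences.
reference = "ACDEFGHIKLMNPQRSTVWYACDEFGHIK"

# 15 contiguous reference residues flanked by unrelated residues.
long_enough = "XX" + reference[3:18] + "XX"
# Only 14 contiguous reference residues.
too_short = reference[3:17]
# 20 reference residues, but interrupted so no 15-residue run survives.
interrupted = reference[0:10] + "X" + reference[10:20]

print(shares_contiguous_run(long_enough, reference))
print(shares_contiguous_run(too_short, reference))
print(shares_contiguous_run(interrupted, reference))
```

The sketch performs exact string matching only; it does not address allelic variants or hybridization under stringent conditions, which the claim treats separately.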

13. A method for detecting the presence of a
polypeptide of claim 8 in a sample, comprising:

a) contacting the sample with a compound which
selectively binds to a polypeptide of claim 8; and

b) determining whether the compound binds to the
polypeptide in the sample.

14. The method of claim 13, wherein the compound 
which binds to the polypeptide is an antibody. 

15. A kit comprising a compound which selectively
binds to a polypeptide of claim 8 and instructions for
use.

16. A method for detecting the presence of a 
nucleic acid molecule of claim 1 in a sample, comprising 
the steps of: 

a) contacting the sample with a nucleic acid
probe or primer which selectively hybridizes to the
nucleic acid molecule; and

b) determining whether the nucleic acid probe or 
primer binds to a nucleic acid molecule in the sample. 

17. The method of claim 16, wherein the sample
comprises mRNA molecules and is contacted with a nucleic
acid probe.

18. A kit comprising a compound which selectively
hybridizes to a nucleic acid molecule of claim 1 and
instructions for use.

19. A method for identifying a compound which 
binds to a polypeptide of claim 8 comprising the steps 
of: 

a) contacting a polypeptide, or a cell 
expressing a polypeptide of claim 8 with a test compound; 
and 

b) determining whether the polypeptide binds to 
the test compound. 

20. The method of claim 19, wherein the binding 
of the test compound to the polypeptide is detected by a 
method selected from the group consisting of: 

a) detection of binding by direct detecting of 
test compound/polypeptide binding; 

b) detection of binding using a competition 
binding assay; and 

c) detection of binding using an assay for
Tango-77-mediated signal transduction.

21. A method for modulating the activity of a 
polypeptide of claim 8 comprising contacting a 
polypeptide or a cell expressing a polypeptide of claim 8 
with a compound which binds to the polypeptide in a 
sufficient concentration to modulate the activity of the 
polypeptide.

22. A method for identifying a compound which
modulates the activity of a polypeptide of claim 8,
comprising:

a) contacting a polypeptide of claim 8 with a
test compound; and

b) determining the effect of the test compound 
on the activity of the polypeptide to thereby identify a 
compound which modulates the activity of the polypeptide.

^AAGTCTGCCTTGGCAGGCACTTGCAG* oTTTGAAAGAATCAGATATATC
AAATTTGTAGTTTAAAATATTTAAGGGAACTCAATTAACTATGCTAGAAA 
AGAGAATTAAGTATTTAGGAGGATTTAATATGGTGTGAAAGTTGTGAAAA 
^CAAAATGGAGACACTAATGTTAAGAAAACCCTGATAAATGGAACCAGGG 
AAAGGCATGAAGATAGAGTTCTCACACTTGTATCCCTGATCATGAAAAAG 

ATCTGC
>Contig5

GGGTTTrrCCGCGTTTTTACCCGAAATCTTCAAGGGATGGGAAAAAGAAA 

ATTGCTAAAAAATCTCGGTTTTTTGGTTTTAACAGATATTTACACCNTGG 

\TCCCATTTATTATGTTGTCCCCAAGGTTTTCGGTGGGTTCCCAATCAGT 

^AGCCCCCCTCCACAGTGAAAGCACTTTACTTTATCACCTTCACCTAAAG 

CATAAAATCCAGCTCTTGAAAGCTGCTCCTTGTTAACTGAATATATCCAC 

ATCCCAAAAGTAATGATCCATGCTTCATAATCTGCCACGGATGGATGGAT 

GGATGGATGGATGGATGGATGGATGGATGAATGGATGGATTGATTTCTTG 

GAGGATTTGTTGAATTTGGGAAATTCCACGCCAGGACAGCTGGCCCAAAC 

^GCCCGCGACAATCTGCTCGGTACAAGGGGAGGGTCCTGGAGAGGGTGCG 

GCCCGAGCCCCAGTTTGGAAATGCCAACTTGGCTCTGCAGCCGGGCCTTA 

GCCACTTGGGTCTGGCGTCCCTCCATTATTAGCGCCATGCCGGCTCGGGG 

TGCTGCCAAGTCCCTGAGAGCACAAGCC

>Contig6

CGCGCTCAAGAAAAGCTGAAGTGTGAATGTTCTGTCTACCTTCACAGTAA 
ATG CTAAGAGAATGAC CCAAGAGCAGAGGGTATCACTCTG CTACGGAGGA 
""TGATTGTAACTGGCTCTCCTGCCTTAGCAAGAAATGCCAGAACCATGGT 
CATTCAAGTTCTTGACCAAAAACTGCCTTCATGAGAATCAACTTCCCCAA 
GAAAAAAAAAGCAGAAACAGGCAAAGCTTC CAGCATGGTAGGTAATACTG 
ACCCTTCTTCCCTCCTTCCTTTGGAGATTCACACAGTAATAATGCATAAA 
GCTTTGCCAATGGACTAAGCACTGCCCAGGGGTTTTTGTCATGCCTGGAC 
TGAAATGCTCTTTTTGCGTTATCATAGAATCCCAGTGCAGTCTGAGTAGA 
CTCTAAGCAAAAGGGACATTTTTCAAAAAGGCTTTAAATTGCTAGTACAA 
AGAAGGCAACAAAACTTGCGTAACTGTGGACAGATTAACTCACTTGGTGT 
TTTGGCTCTTCAGTTTTCCCTTGGCTGCGAAGTACTCCTGAAGCTTTCTC 
TGCGGCTCTTCCTGCAAGCAGGCAAGCAAAAAAACGACTGAACTTTATTT 

CGAGAT 

>Contig7 

GAAGAGCCGCTAACTTGCTG7AGTGATAAGGAATGAACTAAGGCTAGGGA 

CATATTAACATCCGCTGGTGGTGACTCTTTAGCCTAGATCTTACCCCACT 

^CTGCTCCTTCCATATGGTTCGGTCTCAGGCTCACTACCGATCAATGGCG 

~ACTAAAAGCACTAACTATAGACTCCAACACGTCTGTCGTGTGTTTCACG 

ACAAGCCGTGGAGTTAATCCCTCTGACAGTAGCTCAGATAAGGATGGGCT 

ATCATGGGCCCGGAACTGGGGCATGACGCTCGTCACCAACGCATGAGCTC 

CCCAAGTATGCTATACCTGTCCCTATGAAGGGCTTCCAACTCTATGTGCA 

GTCCCCATGTGGAGAGTCAGGTATTGATTGATCAAGCCAGGGGTGTGGTG 

AATGGGGAGCTTCCTACAGGGGTAATGATAATTGAAATGCACGGTGATGG 

GGATTTTCATATTGGTCTCCTAAGGAGATAACAGATTGGATGCGGGGTCG 

ATATTCCACTGCCCAGGGTGTGTACCGAC3GGTATCTGCAGGTGGATCTCC 

TCCCCACGTTTGATTAATACTCCTGTCTTGGGAAGCATAGACGGGCGGGG 

GAAATGATGAAGGGTGACCACTC CCC 

>Contig8
GGGAACGCAGTGCTCTGTACGATGGCCTTGATTGCGAATTCCTGCAGGGG 

GGG

>Contig9

GGCAAGAGATTTAATATTCATTCCATCTTCATTTGGAAGATGAAAAATT^ 

GGGACCAGAGAGGGGAGGGGACTGGGCCAAGTTTTCAAAGAAAAGTCAGT 

AGGAATTGTGAATTCCTGGGGGCCGGGGCCCATTAGTG CTGTT TTGGATC 

AGTAAATGGAGATGTGAGTTTCAACAGTAACAGGGACATTTTAAAATTAA 

AATGATTTAACCTTTAGAAAATGTCCTATTTTGTAATAATGATGGATTCA 

CAGGAAGGTACAAAGAAATGTCCAGAGAGTTCNTGAGCCCCCTTCAGCCA 

GCT^C^TCCAATGTTAACATCTTGCATTATTATAGTACAACATCAAAACT 

GGGAAATCGATATTGGTACTGTCCAGATAGCTTACTCAGATTTTGCCAGT 

TATACTTCCACTCATTTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG7GTG 

FIG. 3 (2 of 52)

WO 99/06426 PCT/US98/16102



TGTGTGTAGCTCTATGCnATTTTATGl ^ * G7AGC77CA7G7AAGC2LCGt» J 




. i. i «v.««v. i ^ A « >J .y.v. i.(-i-rcATCTCCAAACCCTAAGCCCACCTCACCGCC 
TCCACCATCTCTAATGCCTGGCAACCACTATTCTGTGCTCCATCTCTGTA 
ATTAATTGTGTTAATTAATGTTATACAAATGGAATCATGAAGTATG7G7C 
CTTTGAGATTGGGCTGTTAATTTTTCACTCAGCACAATTTCCGTGAGTCT 
AATCCAACTTGTGTGTAGCAGTAATTCTTTCCTTATTATTGCTGAATAAT 
ATGC ^Z GGTATG ^ TGTAT CACAGTGTGTCTAATCCTTrGCCCATTGAA 
AGGAA777GGA7AATTTCCAGG7Tr7GGC7A7TA7GAAT^ 

AAGACATGTGTGTACAAATTTTGGTGTGATCAAAAGTCTCATTTCTCTGG 

GATAAATGCCCGGTAATGAAATGGCTGGGTTGTGTGGG
>Contig10




:;"izrf 1 ^^xcatcatccctgttccagaagattacacg 

TACGACCACGCCTGGCTCACCGACTCACGTGGGCCAGTACCAGAAATTCT 

S»?^^^^SS^2J^ G ^^^ GAA ^ C ^^^^^^^^C^C-CACGGTTA 
GAAAAGCC * GTTTl'CAAGTCrTrtr:a i -rrnrra r a t» -it* /-.r-r.^^^m» . -^m 



GATTGT 

9^£Z*-E2_rirr" ^**'" -*'"- k> - ft '- 1 1 Awv*JAv»vjv,i«ftAAutAeQAGGGTCA 
CTTGAGC . -AGGAATTTGAGACCACCTCTGGCAACATAGTGAGATCCTGT 
CTCTACAAAAAATTT 1 1 l'lTAAATTATCCAGCATGGTGGTACACGCCTGT 
A77CCCAGC7AC7CAGGAGAC7GAGG7G7GAGGA77GC7AGAACC7GGGA 



GAGAGCACA7A7AA7G7TCCC
>Contig11

GG7CTAACATGC7CCAAC7GAAGAAACCCCACACT7G7CCGGCAAGGAAA 

CTAC7ACAGATTTCCTGACCTACTG7GCAAT7CGGGGCATGCGACGGGAC 

TGTGTTTCTGGG7ACGCTG7CTCAGGTTCGTCTGGGATGTAAGAA7TCAA 

CT7CAG7AGTrC7CTCATAGACGCCGACGAGAGGGGCGTC7C7TTTCTCT 

GA7GAA7C7GCCAGA7CTTCCACTTCA7AGAG7CTAAATCC7CCGATTCG 

A7C7AC7GGAGACCCCCACGT7ACAAAAACG7C7AACGTCGGTGACAGCT 

CCCCACATAGGGAAAGA7CACCTGAGTC7CAC7ACC7CACATTAGTGCTA 

TCTCCAGCCCCA7GCTATCTACGAGA7GG7CACGCGAGG7TTAAGGGG7C 

TCCGA77CCGG7GG7CCGA7TCAGC7AA7CG7GGCCC7ACGTGAACGA7C 

AC7CC7GC7CGTAACATCGATACAGGG7CGCGC7GACAAA7GG7AC7ACG 

7AGG77C7CAGG7CAATGCCGCGTCACGAATGAGCCTAAC7ACCCCATAA 

GTGCACGTAC7G7G7TACC7TTCC7GTTCGGCCAAACC7GCTAC7GTATG 
CTGTGCTTGTTT

>Contig12

AGGC7CCA7GTGCTCTAGCCTGATTATC7TTTCAAGTGTTTTATTTGCTA 
ATCTA7AAGGCCCCTTTCGTAAAATGTTCACTCATTTTCTAATTAGATAT 
1 1 l i l i7AA7GTTGAGTTTTGAGAGTTCTTTAGA7A7TrTAGATACAAG7 
c £^G7CAAATATGTGATTTAOUUVTATra 

G IF^^^^^^^^^ < ^^''^ , ^^^^^QCAAATAATTTGATTTTC 
ATAAGG77CAAATTArtAATTTTT7CT7GTATAG7TCACACTTCTAGTGT 
7AAGTC7AAAAACTGTGCCTTG7CA7AGGTACCAAAGGTTTTCTCCAGT7 
7TTTT7CTAGAAGTTTAGAGTTTCATGT7T7ACATTGGAGTCCATGATCC 
ATTGTTAArrAATTTT7GTATA7AGGTAGA7GTT7AGGTT7AGGGTTTTT 
7TAAAAAAAAATTACATATGTTTAATTGCTCCAGTTCCCTTTCATTGAAA 
AGGGTA7CCTTCCTCCATTGAATTGCCT7TGTCAGAAATTAATTGGACAT 
ATT7G7G7GAG7CTATTTCTGGGCTCT77ATCA7GTTACTTTTAAAAAAT 
G CAT CAGTTC CTCCACCAATACCT CATTGTCTTGATTATTGCAGTTATA7 

AGTAAGCC77AGCA7TAGGAAAAGTG77TTTCCTGCTTTATTC7TTNTCA 



FIG. 3 (3 of 52) 




CTGCTAATCTATGAACAlrtGCATCTCT-- . CAAAGCATTTAGTCTTTCTT 

AATTTCTGTCATTAATTTTTTAAAATTTTCATCCTAAAGATTCTGTATAT 

GTT^TGTTGAATTTATGCTTAAGCATTTCACTTTCTTGGTAACAATTATA 

AATGATTTTGTGTTTTTTATTCCACTAGTTCATTTTCAGTGTGTAGAAAA 

GCAATGAATTTTTGTGTGTTGATCTTTGTTCC7ACATCTTGCAACATTAT 

TGAACTCATTTATTAGTTCTAGGAGGTTTTTTCATTTTTCTTGTAGATAC 

C^GAGATTTTCTATATAGACAGTCATGTTGTCTGCAAACAGGCACAGTT 

TTATT^CTTCCTTTTCAATCTATATGCCTTTTTTTTTTTTTTTGCCTTAT 

TGCAGTGGC7AGAACTTCTAGCACTATGTCAAATAGCATTGGTGAAAGCA 

GACATCCTTGTTCCTTGTCTTAGAGGAACATT7GGTCTTTAATCTTGGAT 

TGCG

>Contig13



GCGCC^CCTTTTCTCTTCCAAAATTTCTCTTGTCTAGTTATTTGTCCAGG 
GAAATTTGAAAGCTCACTTACTGTGCAAGTCAGCAGGAAACAACTGGGTC 




CTTTTTCCCAGTCCTCAGAGCCCATTTTCACTCCTGAGTTCTCCCCCACA 
AAGGACATTTTCAACGTTGAGTTTATTACTCAACAGAAAATGGAATGAAG 
TCCAAGACCTAAGGAGATAGAAAGGGGACCAGTTATGGCATCTTCTCACC 
CCAGGACACCTTGCTGCATGTCTCTAGTGCTGAACAGACCACTGGCCTTG 
CTC T GTAGTTTGAAATGCTCGCTGCAACCAGAAAGGCACCAAGGGGCCAG 
ACCATGCTCTCCTGTCTATCACGCCTTCAAAGCAGAATTTCCCAAACCTT 
GAGTCACAGTGCTAACACACGGGGTGCCATAACATTTTTGTTGATTTTGG 
CATrr~ACAAAAATAAAATAAAAAAGTTAAAAATGCATTGCTCTATTCTT 
GGGGCT3GCACACTATTGCCTTTGGCCAAATCCGG7CCCTGACTGTTTTT 
TTAAATAAAGTTTTATTGAAACACAACCATGCTCTTGTGTACATATTGTC 

TCTTGGCT G CTTC G AAGCT ACAATA 

GTGTTCGCTTTTTAACACTTACCTAAAATTACTCTGTAATCCATGGATCC 
TTAATTTATTTAAAAAACTAATGTTAATGAGTAGCTTTATTTTCCTCCCA 
TCTAATTTAAGGCCCACAGAACACCTTCACTTACCTCAATCCTCTCCCAA 
CTTACATGCTTTTAATGTCATATATGTTAATACCGTATACTTTTAAAACT 
TTCrAAAATAGCATTATTTTATAGCATGAGTGTTCATTTACATTTTTGCA 
TATATTTAGAATTrTCTTTGCTCTrCGTTTCTTCTTCTATTTATGACTCC 
CCTCTGGGATCATTTTCCTTCTACTTGAAGTACATAGTTTAGAACTGCAC 



CTGCNACCATTTACCGGGAGAAGGCCTGGAAATCATCACCAGGCCAACTC 
AAAATTATTACGTTTTCCAGAGCTTATATACCTTCTAAGCTATATGTCTA 
CGTGTAAGTGTG CATTCACCTGAAGACGTTAGTGATTAACTTCTTTTAAT 
. - ./^«m»^^»/^^wv?iart&Tr'rT , rrrrTRf5aGC!CTCAGTAAA 




TTTACTTAATv» /VwV1\jo\ji wuwviiwin l www inn^ * • »■ » - — - - — 
TCCCCTGCTAAATCATGGAGGTTTGGGGATTCCTTTAGAGCACCAATAAA 
CTTGTTTGTGGAGGCCTGGGGGTTTCTTCTGACCCACAATAAAAC'r l'GTT 
TAATCCTAAATGGGTCCTGTTAAGAATTCCTrCTTTATTTTGTCATATTT 
TAAGGCCCAGAAAAGGCCTGGGCAAAACTCTTGATGGGCTTTTGTTACAT 
^CCAGCC^TGTATAAGAACACTGGTTTTTAATATTTAACTTAACCATTT 
AGTCAGTACTGAAACAGTTGTTATAGAGATCTGCATTAGTGAGACCTGGC 
CTGCCACATTTCCTTTTCTGAAGATCTTATGGTAGTGATCACCTTTGTGA 



TTT'~ T TGAAACGAAGTCTTGCTCTGTCACCCAAGCTGGAGTGCAGTGGAT 
CTCAGGTCACTGCAACCTCCACCTCCCGGGTTCAAGCGATTCTCCTACCT 
CAGAC7'""'"GAGTAGC7GGAATTACAGGCACCTGCCACCACGCCTGGCTA 
ATJ ^ATAT"TTTAGTAGAGACGGGGTT7CACCATGTTCATCAGGCTGG 



FIG. 3 (4 of 52) 






7C7CGAAC7CC7GACC7CG7GA7CCGC . ^CCCCCTCGGCCTTGTTTGCr a. 

GAGGTACTGTCTAAATGCTGGAACTGAAAATGGCAAGCAAGACATCCCTA 

CCC77GAGGAAAC7G7AA7C7AG7CGGAAA7ACAGA7G7CAACCA&G7C7 

CACACAAGAANATTGTACAAAACCCCTAGGA

>Contig15

GGAAAAAC CTATCACCGCCTCCTATGGAACTTAAAACAAAAAGAAAAGTA 

ACAAAGG AAA7GAATATTTCAT7 C7GGAAGAACA77G AAAAAGAACAGGA 

AGAAGAGAAAGCACAACTCGAACTGTCCACTAGAATTGACAACACTCTGA 

CAGAATGTCTGAACCTCATCGAAGGGGTAAGTGAAAAAAATAAGCTCCTC 

CAGCTTTGGCC CAAAGTCTTATAATTTTTAAACATATTC CTAAATATAAT 

ATAGGAGAGATAGCCTTCATCTAAGTAGAAATTTAGCTACTCTTGTAAAT 

ACAGAGTAATAATAATAATGACATGCCCATAAACAGTGTCTTTTGTGTAT 

CTGTGCTTTTATAAGCACTTAGCTAAGATTATCTCACATAATTATCATAA 

CCACTGTTACTATGACCACTTTACAAACAAAACTGAGGCACAAAGAAGTT 

GGAAAACTAATCCAAACAAACTGGCTCCAAAAGGAACTTTGCTTTCTTTG 

GGTATCAAGTTCTGAAGAGTACACATTTAACATTGAAACTGAGGTCAGAA 

GGCAAGTTTCTATGTAAAGTTGGAGTATTCTGAATACTCTGGGTAGCTAC 

AAATAGTATTTAAATTTTATCTTGGATTCTGCAGATAAGGATAAAATAGA 

TGGTAGGCAAAGAGTATGATCCTTAGGAGAAATTTTTCCTGAAGGAAAAA 

TATATTAATAAAAAATGATGGAATAAACTTCTAAGATCCTTGCCTAGAGC 

AAAACTCATTCAGTCCTTTGGCTGGTAATGTTGAACATCAACAAAAAAAA 

GGAAAAGTTCAGTTTAAGTCTACTCCAGGCAACATTTTCACAACATCCAG 

77AAA7A77AAC7A777C7C777G7GGAA7TGAAC7AGAG77C77777C7 

7A7CC7C777777GG77G77GTA77A777AAAAA7GAG7ACC7777TA77 

ATTGAAATCATTTCAAGTAATGCAGATAAATGATCAGCCCTCTCCCTGTA 

CAAACATACATACTTAGGCATCCCAAACTTCTCTCTGGAGGTGACCACCA 

TTGCCAGTCATTCATTCTGTTTTCATGCATGTCCATACAGTATAGGTATG 

TCGAGAAATGAAGTATTATATTTTTGTGAGT TGCAA rrCTTTTATTCACA 

TTTTTGTGTACTTTGGTTGTCTTTTCTTGTGTTTTCCTAGTACCAATGTT 

ATGCTGACTTAGGCAGATGAGTTGAGTATTTTCCTTTTTGCCCTATAAAC 

7GAAAA7AG777G7A7GACA7GAGAA77A77777A777777GAAGG777G 

ATAAAAACTTGCCCATAAAAATCGTCTGGACCGGTTTCTTGAGGA7GCCT 

GTGTTAGAGCC

>Contig16

CGCTTTAACCTGGGCTACCAATGGTTCGTCAAGTTCTAGATTCTCTATTA 

ATACCTTTTTCTTGTGTCTTTCTCTGGTCTGTTTTCAGCCCCGAGTCTCT 

7AGA7C7G7CC7C7AA7A77CC7A77GAC777AC77CA7777C7AAG7C7 

TTATCCrrTTGCTTTACTTrCC GAGA GACCTGCTT AACCr rATCTCCCAA 

C7C7TITATTGAATTTCATTTCTTTTACTATATATTT7TTACTTTGAATA 

CACCTCTCTCTTCCTCACATTTTCCCCCATAGTATTTTGTCTTCAATTGA 

CAGTTCTACTATCTTATTACTCTGGAGATATTAATAATAGTTTTTAAATT 

TTTATTTATTTTTATTTTCAAAACAGTGTCTTACTCTGTCACTCACGCTG 

GAGTGCAGTGGTGTGATCATGGATCACTGCAGCCTTGATCTCTGAGCTCA 

AGCTATCCTCCTGCTTCAGCCTCCCAAGTAGCTGGAACCACAGGCATGTG 

TCACCATACCCAGCTAATTTTTTTGTTTTTGAGGTGGAGTCTCACTCTGT 

AGCCCGGTCTGGAGTGCAGTGGTGCAATCTGGGCTCACAGCAACCTCTGC 

CTCCTGGGTCCTGGTTCAAGCAATTCTCCTGCCTCAGCCTCCTGAGTAGC 

TGGGATTACAGAAACACACTACCATGCCCAGCrAATTTTTGTATTTTTGT 

AGAGACAGGGTTTCACCATGTTGGCCAGGCTGGTCT7GAACTCCTGACC7 

TGTGATCTGCCCACCTTGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAG 

CCACTGCACCCGGCCACTAATTTTTAAATTGTTAATAAAGACGAGGTCTT 

GCTATGTTGCCCAGTATGGTCTTGAACTCCTGGGCTTAAGTAATCCTCCT 

GCCTCAGCCTCCCAAAGTGTTGGGATTACAGGTGT GAGCCA CTG AATCT G 

ACAl M llUl M rAAAAGl w riUriTCTCl w riACCAAGTCTI M nTTCCCCTTTCT 

G C TTTTTTGGGTT G TTTT A TTrTGATCTCTATCTTGCTAGAAACTTTCTG 

CAGACGTTTAGTAATACTAGATTTTTGAGAGTGGGCAACTGGAAAGCTGA 

TTGGAAACTCTGAATACATGGGTGAGGCTTGTTGGCTGTGAGTGTCATTG 

CTTGATGTCCTGGCAAGGCCAATGGGTTTGGGACCCCTACTATTAGTATA 

GGCCTGATTCCCTGGGAAAGGCT C T TT TGATCTCCTGCCTGGAGGATAAA 

GGCCTGGCTACCAGCCT7CTG7GTG7AATGTGAGGGAGAAGGGCTGGAGT
ATTCAACATCATGCTGAA j. CCTTTCAA * ^TCATCTTGTTTTTAGTAATL

TCC^ACCTTAACTCTCTGTCTTCTGCTAGTATGGGAAAGATGACCTGAAA 

ATC~AACCATTTATTTTTCCCCCATTAATATCATTTTATGATTATTCAGA 

AGT^AAATAATTGTCATGCTGTCCTCCAAAAAGACTGAATCAACTAGCAA 

CAAATAAGAATTTTCTCACAGCTCTGCCAGCATTTTAAAAGAATAGCTTT 

ATTGAGCCCAGGAGGTCAAGGCTGCAGTGAGCTGTGATTACACCACTCTA 

"CCCAGCCTGGGTGACAGAGCAAAACCCTGTCTCAAAAAAGAAATTTAAG 

GAACAGC7TTATTGTTGTAAAATAGACATACAATAAACAGAGCACATATT 

TAAATTGTGCAACTTATACTTTGATATAACCCTGTGAAAACATCACCACA 

ATCAAGATAGTGAATATATTTATCACCTCCTGATACAGTTTAGCTCTGTG 

TCCCCACC^AAGTCTCATGTTGAATTGTAATCCCG^ATGCTGGGGGAGGG 

GCTTTGTGGGAGGTGATTGAATTGTGGGGGTGCACTTCC CCCTTGCTGTT 

rrrGAGATAGTGAATGAGCTCTCATGAGCTCCCCTTCACTCACTCTCTTT 

CCTGCTGC CATGTGAGGATGTGCTTGC CTCTTCTTTGCC CTTCTGC CATG 

ATGTGTTTCCTGAGTCCTCCCTAACCATGCCTCCTGTACAGCTTGCAGAA 

CTGTGAGTCAGTTAAATCTCTTTTCTTCATAAATTACCCAGTCTCAGGTG 

GCTCTTTATAGCAGTGTGAAAAGGAACTAATATACCTCCTAAGTTACCTC 

AAGCTTGTTTTTAATTCCTTCTCCTCCCTTCCTTCATTGCCAAGCAAACA 

ACCACCTGTTTTCTGTCACTATAGATTAGTTTACATTTTGTGGGTTTTTT 

TTTTTTTTGAGACAAGGTCTGACTCTGTTGCACAGGAGCAGAGCAGCGTA 

TC

>Contig17
CGCGT^ATAGGAGATGCGAACTTAAGAAATGATGATAAGGAGACTTTA x . 

AAATATAATTTTGAATTATTTTGCCATTACAGAAATTCTAATTATTTAAA 

ATTCTATTCATAATTTTTAATCACTGTACTTCCCAAGCTTAGCTTAGAAT 

CCTTCTGTGCTGAGGATTAATTTTAATTTGTCTTTTATAGGCCTTATCTA 

AAATC CAAG AATAATTGC CAGAATCAAC CACCTTCTAAATCTGTAAGTAG 
AAATTAGTCTTTTTAAAAATATGCATTCATAAGTATGATTAGTAATAAAA 
ATAATAAAGATGTTAGCAACCTAAAGAACATGTATTTGAAAGGTATTTCT 
TACAGATATAAAAACAGTTTGGTTTAATAAGAGACAATCATTTTTTGAAA 
AGTATGACATTrTTTGAAAAGTAGTTTAGTTTTATTAACCAAGAAAAGCC 
^CAAGTGAACTTTAGTCCTCTTGATAGCTAACATTTATTGAATGCTTACT 
GTGTGCCTGATACTTTTCTGACTTGCATTACCTCACTGAGTCCTCACAAT 
CTTATGAGGCTACTATTAGTAGCCCCACTTTACAGATGAGCAAACTAAGT 
CACAGAAAGGTTAAATAGGTCGTATAGCTATTAAGTGACAAAGCTGAGAG 
"CTGTGATCTTAACCACTTTGGTATGCTGCCATGAAGTTAAATAGCTCAG 
~ AG T C ATT AAAAG AG AACATTT GCATTGAACCTT C CAAGC CACTT AAC AA 




GCACAGATTATTCATCTGCAATAAAAATCTCACCTTTAG11 1 1 1 1AAAAC 
ATAGTTTTTATCTGTGTTCTGAAATGTAACTAAAACAGTGCTTCCTGAAG 
TGAAAAATTCTCACTGGTGAQAATTTTAATAAGTTTTAATGATTCACCAA 
ATCACTTCAGTCATATTTCAGTCATATGCATATGCATATATAGACATATA 
AGTTrTTATCTGTGTTCTGAAATGTAACTAAAATAGTGCTTCCTGAAGTG 
AAAAATTCTCACTGGTGAGAATTTTAATAAGTTTTAATGATTCACCAAAT 




' GTTGTATu i A 1 AUA i uAU\ i. i inwiwtw a « * w«ww* * «-»ww-- 

TATATAAGGCAAAATTTATGAACAATGGTTTAACGTTTGGGAAGCACTGG 
GTTACACTTTTACTTTATGCAGATTGAACCAGTATAGTATGCAAGTCTTA 
AGGAAAAATCTACTGGAAAGGGCCCTCATTCAGACTTCCCAGAGGCTTCT 
CTGGAAGTTGACAATACTGACTTCAGTACATCAGCTCGTAAATGAGGATG 
ATACCTACCTTATCTGCTTTACACAGTTGTAAAAGTAAAAAGTGAACTCA 
GGAAGGGAATTACAGAATTTAGGAGAAACTAAAAGCACGATGTAAATAAT 
AGTCATCATTACAGTTATATAATGCTTGACAATTTATATAACACTTTCGA 
T ACAT G ACAACAAT AACTAACAC C CAGACATGTTT ATAT ACATT AC CT CA 
CTCAGAACAACCATGTGAGGAAGTTGGCCATATGCTTTAATGTCCAAACC 
AGGACAC' r iTTGAGAGTAAAAAGCAGTACTCTTTGACCAACAGGCATAAA 



^CAAAACTATCTTGTGAAAACCGGGATATATGGCATCCTTCCTAGATAAT 
AGATAC*~^TACTATTATTAATTTTGCTGTGAATCTAAACCTGCTCTAAA 
AAAGTTAATrTTAAAAAGTAATGAAGTACTGATACATGCTACAACATGGG
AGTG T C CAAAAACTTTTATCACACACAT C CAATTTTr a TnrrrrsrRT^
TTTCAACTCTA<^TTTACAAATTATTTTr.Tr.T^™ 4™^~^±C^T±f „
ATAACAAGGCCACCTTT. x'GCTAGCCA .CCATACTGAAA GAGCRftT GL -
T GTTGCCCCATGCTGTGGGTTCCATAGCTAACGTTCTGCCTTTTTTCCTA 
CCACGACAGCCTGGGTTTGGTTCCTAAATCAAGCCTTTTCTGGTTTGATA 
CTTGGTAATGCTGAAATAGCAGCAATTTGTCCTAGCTGAAATATCGTAAT 
AAGATTTTAAAAGATTTATTTTAAAGGACCTCAATAGTTAAAAGTCAGCT 
TAATTAAAAGCTAACATCCAAGATGTGTGCATGTGTATGTATGCGTCTTT 
GTAT^AAATAGCCCTCATGTTTTTTTTTTCTTTCCTAGGAACTTGCCTT 
TTGAGCAAAAGTTTTTTTCTTCTCTGTTGACTGGATTCTGTTTTCTT 




GG7GCCAGGCTCCCTTTTAGGGAGAAACTTCTATTTTTCCTTATGGAATC 
CCTAGAGTGTAAACAGACAAGTTCATTTCAGCTCT TAAA CTGCTTGCGTT 
TGTGTTGTGTTACCTGATTTTTTTGACTATTATATTTTTGACTAGCTATT 
GCAACAGAAGCTACTCTTGGGTTTTCAAGGAAGATTGTAGTTTAGACATG 
TAGAAATGTCTTTTAAAAAAAAAACAAACTTTTTTTTAAGTGCACTGTAA 
AAGCATCATATGGTCTAGCCTCCTAATAATTTTCCCTTTTTGGAGACCAG 
GATTCAGGGTGGGCTCTGCCCAGAGCTCAGAGATCCAGTTAAAAGAGAGG 
TAGTCTCGGCCGGGCGTAGAGGCCCAGCCTGTAATCCCAGCACTTTGGGA 
GGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCTGGCCA 
* & & rrrrr.TrTrr^rTA AAAATACAAAAATTAGCTGGGTGTG 



GTGGCAGGTGCCTGTAGTCCuAtiCUAU i i \j/vjww*«#ww~w 

AATCGTTTGAAC C CGGGAGGCGGAGCTTGCAGTGAGACGAGATGGCGCCA 
C^GCACTCCAGCCTGGCGACAGTGAGACTCCGTCTCAAAAAAAAAAAGAT 
AGGTAGACT CGAT GTTGTCGTACC CGAGCAAGTTAGAGCAACGCCACACT 
TTGAGACGAATTTAAGAGTCCTTTATCAGCCGGCGACCAAGAGACGGCTA 
ACGCTCGAAATTCTCTCGGCCCCTTGGAAGGGGCTTGATTTTCCTTTATG 
CTTTGGTTTAGGAAGGGGAGGGGAGCTCAGTTGCAACAATTCTACAGGAG 
TAAAAACATGCAAAGAAATTAAAAAGACAAGTGGTTACAGGGAAACAAAC 
• ^^numr-p-oTrT* B.aT , r , TRTr&Taa.r:aTf5TTA.C3GTATGGGG 



T AAAAAC AT G CAAAUAAA i r A>UU^AUft<wAAiJ loui 1 A^wwwrv*w™^ 

AGTTCCAGGTGCAGGGGCTCTAAATCTATCATAAGATGTTAGGTATGGGG 
GCTCTGCCGGACACAAACTCAAGGCTTTATGCTGTTATCTCTTGAGCGAA 
ATCCTGGGAACTTCGTACATTGCTTGCTTCAGTACCTTATCAGTTAATCG 
GACTCTTTGATATGTTGGGAGTCAGCGTACACAAGTTAACTCCTTGAGGA 
AGGGGGTGGGTAAGGAGTCCTTGATGTCTGGTAAATGAAGGAGCGAAATC 
GAGTTCCTCTGGCTTTCTCAGCTAAGGGAGAGCTTATTCATGTGGAAACA 




^TTC^ATCTTTAATCTCCCTCTAGCACCACCAGACTTTTTCTCTCTGTAC 
-TTGAGATG7AAATTTTGCTATCTGAATTTTCGTCTAAGAGTTGTTTCC7 
—AATA7GCAAATTTAGGGTTATTTAGCTGACAACTGCCAAAGTAG TGAA 
ACAAGTTATCAAGAACTTGAACGTCTAAGGTAGGAAAAAAAAAAGTCTTT 
ATGAATCTATAAGATGTACTTCTATTGGCATGCCTAATACGTCTATGTAT 
^TACGTGTTGTGTACACAGTTTTTCACTACTGAAAATATATAGAGGAGTT 
CTAATTAATTGACTTAAGACAATAAAAGCGCTTGAATCAAATACCTTATC 
AGGAAAAAGGAAAAGACAAGTCAAATGCTTGTTCAAGTCTATATAACTTA 
AGTAAAATCTTrAATAAATAAGCTAGCTTTAACATTATTTGAAATGTCTT 




GTGAGGGTTGGTGGGGTGGaNUtflxw 

CCCCCCCCCCCCCTCCCCCCCCGCCCCGNGCGGGCCGCGCCCCCCCCCGC 
CCCCCCGGCCCGCCCCCCGCGGCCCCCCACCCCCCCCCCCCCCCCCCCGC 
GCCCCGCCCCCCCCCCCGCGCCCCCCACCCCCCCGCCCCCCCGCCCCCCC 




GCCCCCvjv-l-wLl.(.i.i.>.^v.(.uiJtu\.i.u«i.u>.i.'--^^>.>.>.v- 

CCCCGCACCCCCGACCCCCCCCGCCGCCCCGCCCCCGCCCCCCCCCCCCG 
GCCCCCCCCCCGCCGGCGCGGCGCCCCACCCCCCCCCCCCAGCCCCGACC 
GCGCGCCCCCCCCACCCCCCCCCCAGCCCCCGCCCCCCGCCCCGACCC

>Contig20

GGCAGTACGCTATAATTCCCTCTTCACCTTACCTCATCTGTTCTCTGATG 
GATGTA Cl ' 1 ' 1 11 1 1 1 ITAGTTTCTAAATTCCCTTTTCCTTTGCTCTGGAG 
ATGGGTGATTGATGTAGTCTGGGTATTTGTTCCCTCCAAATCTCATGTTG 
AAATGTAATCCCCAGTGTTGGAGGTAGGGCCTGG7GGGAGGTGTTTGGAT
CATGGGGGCAGATCCC . .ATGAATAGC .'GGTACTGTCCTCTCAAiTAG . J:

AATGAGTTCTCCTGAGATATGGTTGTTTAAAAGTGTGTGGCACTCCCCCA 

TTGCTCTCTTGTTACTGCTTTCGACATGTGACATCCCTGCTCCCCTTCGC 

rCTCTGCCATGATTGAAAGTTTCCTAAGGCTTCGCCAAAAGCTGAGCAGA 

TGTGGGTGCCATGCTTGTACAGCCTGCAGAACTGTGAGCCAAAATAAACT 

7 CATTTC CATATAAATTAC C CAGCCTCAGATATTT CTTTATAGCAACATA 

AGAGTGGCTTAATACAGGCTGGGCATGGTGGCTCACGCCTGTAATCCCAG 

CACTGTGGGAGGCTGAGGGGGGTGGAACATGAGGTCAGGAGATTGAGACC 

ACCGGCTAACACGGTGAAACTCCATCTCTACTAAAAATACAAAAAATTAG 

7CGGGCG7GGTGGTGGGCGCCTGTAGTCCCAGCTACTCTGGAGGCTGAGG 

CAGGAGAATGGCATGAACCCGGGAAGCGGAGCTTGCAGTGAGCCGAGATT 

GCACCACTGCACTCCAGCCTGGGCGACAAGAGTGAAACTCCATTTAAAAA 

GAAAAAACAAAATTTCAAACAGAACAAAATGAAAAAAATACCAAGTGAAA 

GGCCCCTATAAAAACCCCTCTGGGGCCCATCCTCCCACCCCCTCAAGTGA 

AAC CACATTTAACAATTTGGTGCATATCTTTCCAAACCTTTTGTTGTACA 

CATATAAAAAACATACATGCTTTGATTTGGCTCAGACTGTACATAGTGTT 

TTCCCTCTTGCATTTTACACTTAATATATCTTTGACATCTTTCTATGTCA 

GTGCATGTTGGCTCGATGATATTCTATCATTAAATACCCTTCCAAAAATG 

GTAAAAT CATTTTAAAAAATCATTCACACAAGTACATATTTACAATTTTA 

AAAGAAAACAGAATCCCAAAACACAACGACAAACCTCTAAAAATAATCTC 

TATCTTTCCACCAGCATGGAACAGTTCATTCCTTTTTCACATAAAACGAA 

TTATGTGATTGGAAAGATTAACTCTAATCTACACATTTATATACAGAATG 

TTCTATTTGTTAAGCCTATCTGAAAATAAAAAATTCAGATGATTAATTCA 

CTTACACTTAGAAATTAAGTCAATATACTATGAATACACATTGTGATCAG 

TTATAATATGATGCTTCTTAGTCTAGGGTTTCAATTAAATAACAGTAAAA 

AAAATTGGATAAATAAGACAGCTAATAACTGAAAAAT CCAGAAATTCAAA 

GATTATATTGCCAACTAAAACACTGCCATTTACATTTTTTTTTCCTACTT 

GGTAG CAAATGCTAATGGAATTCAATC CTGATTACTTAAAGTCAGTTCAC 

ATCACACATTCAATCAGGATAATACGAACATAATATGCCTACTATAGCGT 

TAGATTAAGACATAAAATTTTTTTGCTTGAAAGTAATGACTGCGTACCAC 

TTGAGACATTTGTCAACCACTTCAGCACATTGTTTACGAGTGACTGGATG 

TCCACAAGGAATAAAAACGACAGCAATATTTCTATCCATACAGATTTTGC 

AAAGCTTCTCCTCTTGCAGGTGTCTTAGCTGCTCTTCAGTACTAATCTCT 

TTCTGCAATGAAGTCTGACTTGATTCGTCTTGTGTACTGTCTTTCTGAGC 

CTTCACTGGATCTGCAATCAGAACCTCAAGTGATTTACAGTTGCTCCCAG 

ATGTCTGAATTTTTTCCTCCATTATTTTCTTAATGTCTTTGAAACTGAAC 

CCCATTCATATAGCTTCTTGTACCATAGGATTATGGAAGATGGTATCAAT 

TTTTCTAGTTAGTGATGGCGTTTTTTCAGCAGTTCTTACCAGACACTCCT 

ZAAGTGAATGGGATAAATGAATATTGTTTATATATTTTCGTGTCTTCTGT 

T CTAACAGATATT7ACACC CTGGATGC CATTAACATGTTGTC C CAAGGGT 

CTTNCTGGGCT

>Contig21

CTTTCTCCCTTTTTACCCCCATTTTCGTAGGGATTTGGTTAAAACCCATG 
TAAAAAATCCAAACACCGGCGGGGAACGGGGGTTCAAGCTCGTATCCCCA 
CCACTTTGGGAACCCAAGGTGGCAGGATTGTCGGAAGCCAGGCATTTGAG 
C C CAC CCTTGGGAAAAAAAAGAGAACCC CCATTTTT TTTGAACAAAAACC 
CCAAC CCTCCCAGGAAAGAAATAAGTATGGCTGGGTTGAAGTCACCAAAG 
ATGGCCGACTGGCTGGTCAAGTAACTTTACCTGATGGTTCGTAGAATATT 
TACCTTCACCCAGGTGGGAGAATTGCTTGAGCCAACCCTCAGTGTGGATT 
CAGGAACTTGATTTAATTGGTATCGTGATTGTGGATTAGATTCTCAGGGA 
TGCATTCACTAAGTAAAAGTGATAATAGCTACTTTTAAGTAAAATAATGA 
ATGAATCAAACACTCTAAATCCATGGTGCTATGCTAAGCTCTTTCTGTAT 
TrTATCTCATTTGATATTACAAATATTTGATGTGTTAATAGTAATGACTA 
TCTCCATTTTTACAAGTAAGGAAACTGACATTGAGAGATTAAAAGACTAG 
CACAAATCACAAAGTAAATGAGATTTGAATCCGGTCTTGATTCCAAACTC 
T ACAGT ATTCTAAATTCAAGGAGACTAAATTATAAGATGGAGAGC CAATT 
TTACTTTATAACAGGGTTAGAATGGCAGAAGAGACCTGACATTCACACCT 
CTAGCCAGTGCATCATCTTCCTGTAGGCAAATATGCAGGAAATCTATAAT 
AAGAACGTCCTTTGGTGAAGGCCAGGTGCAGGGGCTTACACTTGTAATTC 
CAG CACTTTGGGAGGT CAAGGTGGGAGGGTC GCTTGATGACAGGAGTTTG
AGAACAGCCTGGGCAACAXAGTGAGACw^TGTCTCTACAAACAA^OU

ACACAAAACAACTTCAAGAAAACTCCTTTGGTATGGATCAGAACAAGATG 

AATTATCTATCTGATCCAAATGCTTAATGACATTAAGCCACAGTCCACTC 

ACTGCCACAATAGAGATATAC CTGCCAATGCCACTCAGGTAATCC CATCA 

AAAGTGGTAATGAGGTCTGCAGCATGACTTGTTCTTAGTGATCCCAGCCT 

GAGACC77GAGATTGCAGCATTTTATTCTACATATGCACAAAACATCTGT 

rGAAAAATCTTCTAAATTGATGCAATACATTCGTATCAAGAATACCTGTC 

rGTAATCTCCATAAACCCTCTCCTTTCTGTTTTAAAAAATAGTAACAGCA 

TTTCrCCTTACATGACAAAGAAATGACTTCACCATCTACGAAATAGTGAA 

TAGGAGCTGTGTGGAAGGAAATTAGCTCTACTTCTTGGTGGAGATGAGAA 

GGGAG7GTTCCTCTGAAAATCAAGGCTCTTGTCATGCTAGGAGCCAAAGT 

CGTTTTTTAGAGTGTGGACAGTTGAGAAGATAAGACAGGGACCATCCACT 

CATGTTTTTCTTATTCCATAGGCCTCTCTCAATTGGGCAAAGCACTCCAG 

ACCTT7TGGAAGAGTGACACCAAAGGCAAGCACCTGCTTGGCAGGCCCCT 

CAGCTTCTACGCAAGTATAAGTGAGTATATAAAATGGGGGTACTTGTGCT 

GTTGAGTACCTTATTTCCAAATGAGGCCTGCCGGTGTCCCTGTGGCTGTG 

AGAAGGCCTCTACTGGATAGGTGGAAGTTGTGTGTTCTCATCTTTTCTAA 

CCCTGGATTGACTTGCCCAAAAGGAAGCCATTATTAACACTATAATAAAA 

CCATCCTTAATCTGGGACTCTCTTCATGCAGTGGTTCTTAACCAGTGATA 

AACATGAGAGTTACTTTTGGAGCTTAAAAAAATTAAGATGCTCAAGGTCT 

ACCCAAACTGACTGAATCTCCAGAGGTGAGGCCCAGGGATGTATACTTTT 

GAGCCAGACCTCAGTTTACCCTGCAGAGCTCATAAGGTTGCATAACACCC 

TTTGTCAGCCACTCTGATGAAAAGAAAAATTGGTGAGGAATAAGTTTTAG 

AGAAGAAGGAGCAAAGGTGTTCTTGGCCAGTGAGAGCCAATGACAGGGAA 

ATGCAAACAATGTATCCACAAGAAAGGTAAATTACCCTATAGAGCATTTT 

AGGATAAATGAACATCTCATGCCTAGGGTTGAGAGAGGGTACAAAAAAAA 

AAAAAAAAAAGAC CACTCTGGATACACAACGCGATAAATGGAATAAAGAA 

TTTTTTC CTTGTAAATrAAAAAAATCCTTTGTTACTGAGGTATAATTTAA 

TCTATTTTATGTATAGTTCAATGAGGTGTTATAGATAATAAATTTTTTTT 

GTAAATTATTATATTGTCATATACTCATACATTCATTTTTAAAAGTCAGA 

AATGTATATAACCATTAAACTTATAAATCATTCAGTCATTCAGAGATATA 

GAT A CA C G AGCAT ATTTT ATAT C CAC CACAATAATTATT AC CAT CT CAAC 

AATTCCATCACCCCTCAAATTTCAAGCGTAGGGGTTTTTAAATGTCAAAG 

GAGTCTACTCAGTGGGAAGAAAGTTAAGGAAAAAACCTTTGGGGCTTTGG 

GCTC CTTC C C CCTGGGGTTAAAAAGGCAGGAAATTGGGCTTACCCC CCCT 

GAAATTGGGAACTGAAATTTTGGGAAGTTTAAAAAAAAAAAAA

>Contig22

TCAAGCAG C CTTCCTT C CTTGGCTT CCCAAATTGTTGGGATTACAGGCAT 
3 AGT CAGG ATT C CTGGCTTAGTTTACATTTTCTAGAGTTTTGTATAAATG 
GAAACATACAGAATGTATTTTT TT GCGGAGTGGGGGAGTGTTTCTATTTC 
TTTCTTTCCATTTTCGCCCCCCCNCCCCCCCGAGACGGAGTCTCGCTCTG 
TCTGTTGCCCAGGCTGGAGTGCAGTGGTGCGATCTCGGCTCACCGCAAGC 
TCCACCTCCCGGGTTCAAGCAATTCTCCTGCCTCAGCCTCCTGAGTAGCT 
GGGATTACAGGCGCCCGCCACCACACCTGGCTAATT TriTr TGTA r i rr ' r 
GGTAGAGACGGGGTTTCACCATGTTAGCCAOGATGGTCTCGATCTCCTGA 
CCTCGTGATCTGCCCGCTTCGGCCTCCCTAAGTGCTGGGATTACAGGCGT 
GAGCCACCGTGCCCGGCCCAAGTGTTTCTATTTCTTAACCAGCTTTCATG 
CAATCTTTTTTTATTTTACCATCTCTGTGATCCCACTCCCAAAGGTACTA 
GATGTCGATTGGTCCTTAGGATCAGCTACCATTTGCCCAACT G CTTTCCA 
GCCTTCCAAAAA Tlll ' ll ' l \l'l''rTrrr r CTTAAAGATACTCCTGTGTGAGG 
CTCAGAACTCTTGAATTGCTACTGCAAATATGAACTCGGTGATGTGAATG 
CCAGGGAATTGCCTGATTGATCAAAGAAATGTATCCCCTTCTCCCTCACT 
CTTGCTGTCTTCTCATTTGTTTTCCCCATCCTTGTGGATTCGTGAATTTA 
AATATCCCTTTAATGTTATAATATTTTAATGGCGTTTGGCGAAAAGTACA 
GAATTAGGTGCAAGAGTGCATAGCTGTTATTTTTTTTTTGGCCTCTGAGA 
CTGTTCATATATGCAAGTTATTTAACAGAAAGTTCTGCAGTGACCTGAGA 
TGTCAGGGGGGTCTGATAGAGTACGTTTGAAGGCAGTTACTGGAAAAAAA 
TAATGCCATTTCTGGTTTGTACTTCGGTAAGTTCAGATGACCCAATATAT 
TGTTTACATGTGGCATTCAGTAAAAAAGTAGCTTCCCCTCCCTTTCTTCT 
TCCTTTTCTCCTTTCCTGCTTCTATAAAGCATCTGCTTTGGGAAACTTCT
-AGGAGGAGAGCTrSCC^jCCCGTGGi. .ATGGAGAGGTCTTGCAfiAGA- :

AAAAGAGATGCTCCCACTCAATGCAGGATGG7GTGGAGGTAAATGGGGAT 

ACGTC7GGCATCACTCAGGAATGGGCCTTCCTGGCAGGGAAAAAAAGGGA 

GGGGAAAGAGGAAGGGAATTCNNANATNAATTGCTGAATACGGGGATTCC 

ATGG C CTGGATC CAGGAAGAGAACTT7GGGAGGTGTGAACCTGGAAGGCA 

-CANCTGATGAGGAGCAGCCTGAACTCCGGGGAGGACCTGTTTTTGGTGG 

-~C3GAAAAAAATGCCTTCCACACACAGGGAGGCCACCCGGCTGATGGGC 

^Sggggttggacggacagccctaggacaggcttgggaaaccaggctcagg 

"AGGGCCTGCGAGGTTCTCGCTGCGTCTCTTTCCTTCCTGGTCTTAGAAA 
ATAGAAT C C AAGGC CTCTTGAGAGTGGAAGGTGGGTTGGGAGGAGGGCAG 
ATGGGGCTTAGGCCCAGGACACCCGTAGAGCTACTGCCCAGCTGTCTCTC 
AGGGACTCTGCTGAGGTCACTCCAAGGATCATTCTTAGCCTTGCTAGACA 
GTACTGACAGAGGGAACCGTAGTATCGCACCCACTTCCTTCTCTTTCAAT 

gaaagtt:aaaggtcaccatttcctctggcaaaggaagttccacaaatat 
^ccatttccggtcttagaaacagcaaggtatcaagcaattgcaaacttcc 
■^•gtgctggggaattcccaaggaagtaggggcagagttctggtggagacaa 
agtgaattccgagtgattagtcagtagcagtagcagtagcagtagcagta 




AGTAT""~TTATTGGCAGGAAAAATAAAATAGCTATGAGTGAAATAATTCA 
-^AGACCTGAGCCTCCATCAATTTTGTGTTTAAAGGCCTGACTCTCTTTA 
Ac—- CCCTGGGATGGAAGATGCAAATGTTCCTGATCTCACTGTCAAAAA 

Xgaagaaccagtgggtatattgtatgcttgagttccagccattagtcaca 

AGACATAGAGATGACTGCCATGTGTGTAGACTTTCTATAGACTGTGTGCT 
AAACCCGACCTGCCACTTCCAAGGAGTAGATGAGGAATGTCCATGGTTC7 
GGGGAGCCCTACCCCAATTTGGGGCAGACATTCCAAAGCTCATTTTCTGT 
GGAGGGGGTTGATGGTTAAAGGAACGGCTGGGATTTACTCTTCTTTCTAG 
GGCCAAGAAAATGACATGCTGCCTCCATGTTTAATCATCCTTCCCCCTGT 
TAATAACTATGGCTTTAAGTCCCCGGTTAGGGCCTTCCTCCAAAATTGGG 
GAAAAAAATTCCCCTCCCCCCCTAAAAATTTT TTTTTTA AAAAAACCTTT 
■fTTTTTGGGGGTTGGGAAAAAAACCAAAAATTTTTmCCCCAGGGGTTT 
tttia TTTAAATTTCTCCCCAAAAATTTGTTTTTTTTTTTCCGCG AAAAA 
AAGAC C CCC C CAAAAAAAAAAAGTTTTTTGGCGGAAAAAAAAAT ATTTTT 
TTTGT GTT AAG AAAT GGAGAAG AAGGGGGGTTTTTTTTTT CTTCTCCCCC 
CACCCGCCAAAGGAAAGGTTGTTCACAGATTGTTTTGTGTCTCCCGCCCA 

ATGTGCCTGCGAAATCATCCTTCCAGAAATATTTGCCCCTTTCTTTTGTT 
MAGAGTGGCACTGCCCTATATGGTGACCACTTGCCACATGTGGCTGT.G 
AACAC--GAAATTGGCTTGTCAGAATTGCAGTGTAAAGTGTAAAACACAT 
ACCAAATTTCAAAGACATGGCACATAATAAAAAATGTAAAATATCTCATT 
AACAATT*"rTATATTGACTGTGTAAGTAACATTTTGAATATATTGGATTA 
AATACATGGATGATGCCCCAACACCCACAGTCCCTTATCAAGTCTCTACT 
TCACATTTTTGTACTTCTGACTTAGAAATAGCACTGGCGTCTAAGAGCCT 
ATTAATGTCGTCAATAGGTTCTTGGGAACCACAATTTTAAACAAAATGAC 

atataagaaaacgaataacattgaacaaaatgacattattcgaggacctg 
"tgcatgttgtttcacttaaagtcagtgtccaagaaactatcagtgacat 
ttagtgaggaattgctgtccttcctgtttacaggaacctgggcaagttac 
ttaattcctctaagcccggtttatatccctgcaaagagagaaggataata 




AAATGGCTGACAGTGTCCTTGTCACA^uaaija imuium ^ v -~"*~~ 
"TGCTATTGTCTGCCTCACTTCACTAGTAATGGTCCAGGGAGGCCTTTAA 

-GTGCATGGTGCAGTACATTCACATGTTGGACATGGGTGAAGGGAAACSAC 

^GGCTCATCTAAACACAATAGGATGCTTGTGGTGTTTTGAGGAGGAATC 

\CTAGTTATCCACAGCTGTAACATGCATGGATCAAAAGAGATAAGG 

-» « » ./..^wrwr ^rt'ra.nrAiAru'A.TTACAAAATGCAGAGACCAG 



AAGGA 




rTGTGGG7GGTGGTGAGTCAGAUui-A<j<_ UUtiLiui^i „*r^rA 
GGTTCTGGGCAAGTCACGCCATCTGTCTTGATGCCCTTCCCCATCTATAG 

AGAGGGAGCAACTGAGGCCCCTTCCAATACTGAAGTCCT^ATTTCTGCT 

AC*— ■AGAAATATCCACATTTTTGGTAAArrCAAATGATCCAATGATTCC
ATTTCCTAATGTTCAAAAv- f AGCCCCAu AACATCTAAATGAATCAMAilr*
AATAAAATATTTATTGTGTATGTTTTGATTGCTGAAACTTCTATTTTAGC 
AACACACACACACACACACAGAACCCATAAGCCT7CATCTTTCCTTGGAT 
AAACGAGCCT TCCT GTCTGGCCATTTAAGTCACGATTAAGTAAATGATT"- 
CCAACTCGCCTTTTGCAGCAGTTCAGATGGGTCTTTCCTGCGTGGCAGTG 
GCCCTCCTGACTTATGATTTCCTGTGTGTCGGCCTGTTACCACTGCAGCT 

T^^^^^^ ca ^^ gc ^ ctgacccc:a agagactgttgg 

AGGCAAAGG^.TCAGTCCCAAGAACCTCACACGTGGGGAGCCCGAGAGCC 

cagccctgaccttttctccagtaataacataagaaacaacaggcactggc 

CTTATTTTGGATACAAAGAGTGGTGCTTTTCCTTAAATCTTCCTTTAGTC 
AGGGGTACCCCTTCATGGACGCCCCAACATCCATGGTTCCTGCTTGAGTC 
CCTGCTTCCATATTCCTGCACTTCTCACTTGAAATATCCCTGGAGTACGT 
TAAGCAGCCAGGTTTGGAAGTTCTTGCTGTGCAGGCGGGTGTGTGCATGT 




GGCCCAGGCCATGATGGGAAGCACTCTCTGCCTAAAGACTAGGGTGATGC 
GCCCTCAACTG7GGGAATGAGCCCCAGCTCTGGTGTCTGCCTCGGTTTTT 
CCTCCTGGACAATCAACATGAACTCCTCACCCCTCTTATCCACTTTGCAT 
AAACTGAAAATAACAAACCCAGGGTCTTTCTGTCACAGGAAAGGGTTTTT 
TTTTATAAGATTAAACAGAGATGATTCAACACACCCAGGATATAACACAT 
GGGCCATGAGTCAAGGCCAGGCATTGCTCTGGTCAGCCTGTTGT7TGGGC 
CC CCTTGGCAGGGCTCTCCCCTGAATCTTCCCCCTCTTGACTCCCCATCA 
CCACAGCACGTCCAGCTTTGGGTACAAGGCCAGTAAATGGGGAAGGGGGT 
CAGATGACATAAAGAGCCCTTTCCTGTCCCATTGAAATATATTTGGATAA 
CAGATGGCATTTCCCCCTGTGTCTTGCCCAGGGCCCAGAGCCTCCACTTG 
CTAGAGGCAGACAGAGGATGGAGAGCCCCTTCATTAGTGGGAGGACATCA 
CAGGTGGGCAAGAAACCACAAGCTTGCACTGAGGCCCAGCCTTGAAATAG 
CAGCACCTGCCGGCACCTGTGGTCTGGGGACAGGG7CACAGGATGGAGGG 
GCCTCCTAAGCCTTTTATCTCTATGTACTAAGTACAACCCATTTTCCCAC 

CTCACAGAGCCAGATCAGCCTCTGTGAGGTCCTGGTGGCAAAAGGATAAT 

TGCCTGCCCGCCTGrrrrrrrviTnrvvw » 




CTGCAAACCTCAAACCAACTTCAAAGAAGATCCAGCACC 
>Contig24 

CGCG TAGT CTAAAGACTGAGTCTGAAGCTGTCCCTTCCTGCTATGGACTT 
CAGATTTTAGCCCACTTGAATTGCTCCATATCCTCCAAGCCATGGCCATC 
CCTTGACTCTCTGGGCTCCCAAGCACTTGCTGCCTTCATCACACAGTTTG 
AGTTAAGGCAGAAAGACTGGTTTCCATGTACACTrrGTGGAAGCT^CTC 
ATTTCTTTA7ATAATCTCTGTCCTTTGTCTACTGCTTTAAAATCTAGAAA 
TTGTTTACAAACACAAAGGTGATCCTTTAAAAGC7CAAAGCTGATTGTGT 
CACCAATATATACCACTCTTAATGGCTTCCCATTAAACTTTGAGTAAAGA 
ZTTTATGGAGCCTACATAAGGCCATGACTACCTGGCTCTTATTTTCCTCC 
TCATCCTC^TCTCACCAACTCACTCTCCACTCCTATACCCCTCACTCCTT 




CACTGAACTGTCCACTCCCAGTCTAAGACATGTGCTTATGTCACACGCCC 
TTACCGTGCTTATCTCAGTTTGTAATTATCTACTCATTTAGAAAAGTGTT 
GATGAAGGTCTTCACTGTCAGCTTTCAGGATAGCAGGAATCATAGCTGAT 
TTTACTTACTTAACGGGGTTTCATTCTTTGTAA C TT T ITmTlTTT GAG 




C CGAGT AGCTGGGATTACAGATGC CTGTCAC CACGC C CAGCTAATTTTTT 
GTATTTTTTGTAAAGACGGGGTTTCATCATGTTGGCCAGGCTGGTCTCGA 
TCTCCTGACCTCAGGCGATCCACCCACCTCAGCCTCCCAAAGTGCTGTGA 
TTACAGGCATGAGCCACGGCACCCAGCCACTCCTT7TTTACTTATGGGTG 
AGAAGCCATTAGAGATCATTTCTTCTTTTCTTTCTCTCTTCACTAAGGCA 
CCAGGGTCACTAAGTAGTAGGATACTTTGAACTAGAACTCAAGAAATTGA
GTTTTAATTTTACCTCACACTCTCATATGAATTCTCCATGTGACCTCGGG
CCATACTTCCCCTGTACCCTGTTTCCTCTTTTATAAAAGTAAGAGTTTAA 
ACTAGATGGTCTCCGACATGCATCCTTCTCTAACATATTCTGGAACCTTC
^; iC- , — . -GGACCCACAAGCTCCAATCTTATAAAACATATCCAGTs.

\CC AACATAGATCATCTCACCTTGCATAT CTGATTTTGTGGATCAT 
GGGGAAAAACTGCTGA^CCTAGCAAAACCCATGGCATAGGATAAGTGU 




^ItWttaagtaaggacgccgg^^ 
aacat: 



^ \GAAAAACAT ATTGTAGGAAACCATTCTTTTTAAAAAAAAAAGTCTCw .
AAAAAAAGAAGGATTTAAACATAAGACCTAAAAC^

. tagttcaaccattgtc^gacagtgtggcaattcctcaaggatc-Sa 

aggattagaaatcattctattgtaaagaovcatgcacatgtatgtttjvt^ 
g^gcactattcacaatagcaaac^cttg<^^SSaatSSS 

^T^TAGACTGTGTAAAAAAATGTGGACC^A^ 

g»gccataaaaaagaatgagttcattc^S?gg^ 

^^^catcattctcagovaac^^^ 

ctgcatgttctcactcataact(3ggagttcsaac^^ 

acagggaggragaatgtcacacaccagggc^^g^gctggss? a a 

3ncTTc^zoz»Tcccauaiutf^^ 



^^i^^AAC^GTTCAGCTTTTATTGCATCTTGCTGT^ 
;T A ^:;:^I^£CCmAAAAGGGTGGGAAATAGTAATGCCCTCCTT 
: AA ^;Z"-^? A I^^TTACCCTCAGATTCTGTTTGTTG 
* -«TGw iAACAATGTTTTCTTAAAGGTCTAAAGGAAATGTTTTCT 

FIG. 3 (14 of 52) 

WO 99/06426

PCT/US98/16102

" "CAACATAATATTCTGl ijCATTGCAGAAGGTCTTTTCTTTTGCCTTTTtj 
^TAACTGGCTTAACAGATTTTATGTTTTATTGAAATAATTTCTATGCCAT 
"AT^ATTAAGTTTrGGTTTGCTTAGAAAACACTGAGATTAATACAATTTT 
~^AAAAATTATGATTATTACATCCATATATCTTTATG7ATGTGCTTTTAA 

A.GT C ~~TG7GACATTGAGTTCT AGGGCTTGACT CCTGGGT CTTAAAAGGA 
CAAGTCCTGCTAAATCTTAAATACTGACAGCAATTAAAGGCTCATCTTCA 
GGACTGGTAGAAAATGCCAATCAAAATAAACTGCATTCTTGAAACACAGA 
GCCAGAAATTAAAGCTATTCAACTCAAGGCCCAGGAACTATAGTGGAAGA 
GGTGGGTGTGTGAGATTGTAAGGGCCAATTTTGAGAGATAAAATAAGTTC 
AATT-C-C-ATAAATTAATCATAATCATTGATGTCCAAGCCACACTGATG 
CAAGATCAGCATATGGGTCCTGTGTCAGATTAACAAGGTTTTCTTGAAGC 



ATTAACC-ACTCCTTAATAAAGGTTATAGAGGTTATAAAAGGCTTCTGGA 

AGTTATAGCTATGGTCAAGATAAAAATTTCATAGATTGTTAATACAATTT 




TTGGAAAATTAAGTCTCGTCTCT(^AAA^AAiUi«*«vj^i. i. " — ; 

'i^--^^ TT ^ TC CTTGAGTTATCAC'rTTGGTCAAATGAATGACTTA 

^T^ACAATGACCTTrCATCAAGTGTTTrAAACCTrTCAAATTTGACAAA 
CTCTCCAAAATCAAACTACAAATTATGTCTTTTTATGACCTAATGAATCC 

-GTG^^ATTTGGTATAAAAATTrrACAAGAAACATTGTCAAATATAAA 
ATA^GTGTGGTTTTGTTTGGGCTGTATTTGTATAAATATGTTATTGGTA 
^rTG-TCCAAAATTATAGGAAACTCCTATAATTCTGATATGACTTGGTGT 

-ar^*"AACAAAT rw "CCTCATCAAGTGTGTCTTTGACTATGGTTGCCCTAA 
G^T^CACAGACAATTGTCTTGCTTTGGTCCTCTTTAGAAG 
GTGGT^TATAATCAGCTATAAAACTCTAACGGGTGCTCTTGAATGCAGG 
C^AAGATAGCTTTGGAGACTGTGACATCAGAATAGAGGAAAAACTTTCA 

gS^SSgagSctgaaatattcatc^tatcaagc^ 

AACTTCATAGATGGAACTAAAAGAATGCTGAAGTAATCTTTTTGACTTTT 
TTTC^AGAATGTTGATCCTrCGTTTTGTTTTTCAGAGTCNAGGA^TTT 

tctgttgagatattga^ 

CAATTTGGAGCA 

ATCTAGTCATrCCCCAGCCTGACCAATTCAATGGCCCCCATCTTAGTTAA 
ilSc-TCACCCTGACAAGGCCCCATCTACGCCTCTGACCTCATGCCCTC 

SSc-CAGTC^G^ScCCTGCCACACTCAAGGGCTT^^ 

^X^GAT-CCACCGATAGCTCAGGGACTTrGCACATGCTACGGTCT 

"^GC^^GGCTCCTCCCCAGATCTTCTCATGCCTAGCTGCTTCTCATCAGC 

--^^^ AGAGACTGT CCCTGCCCCACCTCTCCAGG7TCCATACCTGCCA 

^£X~r^^^AATCACGTAACAGTTTCTTCACAGAGCGAGTTACCATCCCA 

-^AmCC^CCTATTmGTGACTGGTCTGTTGCCTGTCTCCACCA
^^GAAGCCAGCCTGTTCTGAGC^CACAC^?^???;^^



ATAAr-~aTrT-T-rr£» » XTZ: 1 1 * aATTACATATAAACTGGA 
V^^^ TCT ^ TrC ^^ CA ^^GTAGGCGGAAAACAATACAAGAC 



^?A^f^SIfI^^ AT ^" ACC ^ CT ^ GGT ATTCTG 



>Contig27



> A t^?^^ GC ^^ < ^ CT( ^TAAGC(^ACTGAGATCAGAGA 
^I* AAT ^ CC ^^ T ^^ C ^ G ^ C ^GCTGGGAGCCAAGTC 
- wAAA^ x vjvjGGCCTGCTGCTTCCAGTTCCCCTCTCTCTGCATTGATGTCA 
--^ATTATCCrTCGTCCCAGTCCTGTCTCCACTACCACTrTCCCCCTCAAA
^CAGATCAGAAGGAAGGCAATAC^GAAGCAGAAAGTCCAATGA^a^
t^ScAGCAGTCATGGGGGTGGGGTAAGGAAAAGAAGTGGAAAGAAACA 
GACAGAATTGGGTTATATTTTGGAGATAGAACCAACAGAAGGAAGAGGAG 
AAAC^CATTTACTGAGAAGGGAAAAAGTAGGAGAGGAATAGGT^GGGA 
AAT AAATC CTGCTGACATTGGAAACCCCAAGGAAGCCTCAAAAGTATATT 
^C^-GC-^AGATTTAAAAGAATAGGAAAGAAGCATCTCAACTrGGAAT 
TTGAAATC~ATTTTTCCATAAAAGTATrGTTAAATTCTACTCATACTCAC 
* i ari-ra raTTCTAAAGAGTATATTGAAAGAGTTTACTGATATACTT 



acttgtcSgtgttgacagttcatatgtaccatgtacatgaagaagctaa 

at^t^actgtSgtcatttgctgagcatgtantgagcc^gtaactct 

aaatgaatgtttacactctttgtaagagtgc^ccaacactaac^ 

^?g™Saaa(^caccctatattttgcatagtaccaa^ 

aSatta^ctt<^taacaattttaggagga(:cagagctact 

V^T^^i » » » R/ii^r^arrriTATTACAGATGGGCAAATTAAGGCATAAG 



GGGAGAC 
AATTG 

TGTTI* 
GAAC
>Contig28

CCCACAGTAAGCTCTGQATAGTAGAASTTATAGSSarrr^^^^^ 



-^GCAAGTTCA(K:CAAGTGCGCCTTGTCATTGTCAGTCAT^GGTGAT
jTGTuATTGGTCAAACAaTTAGTTTTGCTCAGCATCTCGTGTGTTTTCAA

^GGACCTGAGGGTTCATTTGCCCATGCAGATCTTGTAGTCCTGTTTATTC 

~ATTAATTTATCTTGCAAATCTATAATGTTTTATTTTAAGCAGCGAGAGC

'-3TGGCAGCCTTTGGTCTGGACCCTTTCTAATGATCATTTAGTATCAGGC 

TATGTGGGAGTTGATTGTTTTGCATTGCCTGAAAGCCAACAGTATCACTC 

r T C C"*CT AGGTGTGGCAGAGATGTGAGAGAGGGAGACTGACAGT CTGTGG 

GvjTGTATGCAGTGTTGGGGGAAGCGAGGCACAGGGGACAATACTGTGGT 

3TATAAAACTAGTCTAAGGTAGCATCAGGAAGTTCATGAAGCCAAAATGA 

TTT C AT AAC AG CACAAGACATTATTTGTTTTTGC CTC CCTCT CATTTTT 

t^T— -TTTTTGAGACAGAGTCTTGCTCTGTCATCCATGCTCGTGTGCAGT 

GGTGCAATCTCGGCTCACTGCAACCTCCACCTCCAGGGTTCAAGCAATTC 

-CATGCCTCAGCCTCCTGAGTAGCTGATTACAGGTCTGCACCACCCCGCC 

GGCTAGTTTTTGTATTTTTAGTAGAGATGGGGTTTTGTAATGTTGGCCAG 

GCTGC CCTGTCAT T TTmT T T ACTAGTGTCCAGTGGAGTTTTTTAGGGG 

CTACATAACATGATACTGTCATTAATCTAATGGCTAATGAAAGGGATATG 

TATATGTTTTrGTGTTTAAAACAAACTTCTTTGGGGTCCTCAATAATTTT 

TAAGAGTATAAAGGGGTCCTGAGATCAAAGAGTTTGAGTTCTGCTGGACT 

GGGACAGTGGTTGTCAACCCAGATTGTACATTAGGGTCATCTGGGAAGCT 

^TAAAATAGTACTGATGCCCAACCTTACCGCAAACCAATTAAGCCAGAAT 

CTCTGTGGATGAGAAGTCTTCATTGTCATCATCACCATGACCATCATCAT 

TGTCACCGTCACTACACCATTATCATCATCATCATATCATCTTCATTATC 

ATTGTTAGTATCTCCATCACCATCATCAGCATCACCATTATTATCATCAT 

CATCATCCCCACCATCATCCTCATCGGAACTTCACCTGCATGGAGGACAA 

-CCACTATGCATTAGGTGCTATGCTATTTGCTATACTCCTTATTCTCACA 

ACTGCCCAGAGAGGCTGATATTATCTCACTTTATAACAGGAGGAATCTGG 




ATTGAGAACCTCrTTAGTTATGCTTTCACCCCTCCAACACCACAGTAAAT 
TTTTTCTTTTTTTAAAAAAATTATACTTTAAGTTATAGGGTATATGTGCA 
TAATGTGCAGGTTTGTTACATATGTATACATGTGCCATGTTGGTGTGCTG 
CACTCATTAACTCGTCATTTACATTAGGTATATCTTCTAATGCTATCCCT 
CCCCGCTCTCCCCACCCCATGACAGGCCCTGGTGTGTGATGTTCCCCACC 




CCAGC — CATCCACGTCCCTACAAAGGATATGAACTCATCCTTTTTTATG 
GCTG C ATAGT ATTCCATGGTGTATGTGTGC CACATTTTCTTAAT C CAGTC 
-ATCAT~GCTGGACATTTGGGTTGGTTCCAAGTCTTTGCTATTGTGAATA
GTGCCACAGTGAACATTCATGTGCATGTGTCTrTATAGCAGCATGATTTA 

~AAT GGGTATATACCCAGTAATGGGATGGCTGGGTCAAATGGTAT 

^C^AGT^CTAGATCCTTGAGGAATTGCCACACrGTCTACCACAATGGTT 
GAATTAGTTTATAGCCCCACCAACAGTGTAAAAGCATTCCTATTTCTCCA 
CATCC^^TCCAGCACCTGTTCrTTCGTGACTTTTTAGTGATTGCCATTCT 
AACTGGCACCACAGTAAATTTTrATAGATTrrATAAGCAAATTGTATTTA 
CTGTGCAAGAATTGGTTTATTTTTTAAACCATGTGTTGCAAACATACAAT 
GGTTAATTGTGATATTTGCTCACTACAAGATCATCAGATCACTACACAGA 




GCCTTAATGTGGTrAACTATGTAATTTrTTTCTGACTTTTTGAAATACTG 

AGAAGAGCTCATGACTCTCCCATCTCCTAATTCTACCTTGGTGGATTTTA 

GACTGACCACAACTCATGGGTAAATGAGGGAAGACGAATAAGAAACCTTG 

C-nTT^rrCCTC Cr r U TlTTT G GCTGGCTGCAGTGGCTCACACCTGTAA 

^CTCATCACTTrGGGAGGCCAAGGTGGGAAGATCACTTGAGCTCAGGATT 

TCAAAACTGGCCTGGGCAACATAGTGAGACCCCATCTCTAAAAAAAAAAA 

AAAAAAAAAAAAAGGCGACAG^GGTGCGTGCCTGTAATCCTACCTACTC 

AAGAAGCCGAGGTGGAAAGATCACTTGAGCATGGGAGGTCAAAGCTGCAG 

T GAACCTTGATTGCACCACTTCATTCCAGCCTGGGTGACAAAGCAGGACG 

CTGCCTCAAGAAAACAAAAACAAAACCTTAATTTrTrGGCTATTCTTTTC 

'GG7AAGAATGGTATAGAGATGGGGATGAGGATGGCTATTGTATGAGAGA 

GCAAACAGGGTCCAAGCAGTGCTCTGGGCTGTCTAAGGACCAGTAGTCAG 

CTTAACTTCTCAAATTTCCAGGGAAGGAGTTCGGAGTGGTAGAATATCCT 



FIG. 3 (19 of 52) 






GGGTATGCCCAAAGCATt^CCrraCAAATAGCCTaTCATGAATAATTTS.- 
TTCA77TGTTATGACTGGAAACTGGCTTTGTGTATGCCAGAGAATGGGGG 
^ G S^^~^"^ TGTCT ^ GCTCTCTGTGCCT ^GGGGCAGTGA 



GAGAGCAATGTAACCC v_.. J>A u» 1Ai .<_ ( _. . 

>Contig29

GTGAACTCGTTTTACCTGTGTAGCAGACCAAGCCGCAGACAAAATCCOTC
AGACACC^AATTAAAGAAGGAAGGGCTTTATTGGGCCTGGAGCTGCGGCA 
AGACTCACGTCTCCAACAACCGAGCTCCCCGAGTGTGCAATTCCTGTCC'* 




GAATGGAACAGAACATGACAGGGATCTTCACAGTGCTTTTCTTATGCAAA 

T ^ c 5^ T H A ^ T ^ GGGGTCGATCm ^ 




ACTCAAAACAAACAAA GAAAA AAACAGGGAGTTGTACAGCAAAATAAAC-'- 
TTTGATT - - GACCAAATTTTGGGAGATCAGGAATTCTCTGAAGGAGATGC
TTTCAGACCTCAGCAAATTGTCCTGTTGGTTTGAGCCATAAAGTTAGC""*
ATGCTGGTACCAAACACCAGTAGGAGATTTGTCAAAGGTAAGAGGCATC




""""^ * - • a awwuvj i 1 1 A i rrTGCCACGGTTGAGGACAC 

CCACCCATGACAGAGCATCAGGAGGTCCTGACCACATGTGCTCAGGGTGG 
TCTGAGCACAGCTTGGTTTTACACATTTTAGGGAGACATGAGACATCAGT 
GAATATATGTAAGATGTACACTGGTTCCCTCCAGAAAGGCAGAACAACTr 
GAAGCAGGGAGGGAGCTTCCAGGTCACAGGTAGGTGAGAGACAAACAATT 
GCATTCTTCTGAGTGTCTGATTAGCCTTTCCAAAGGAGGCAATCAGATAT 
GCATTTATCACAGTGAGCAGAGGGGTGACTTTGAATAGAATGGGAGGCAG 
GTTTGCCCTAAGCAGTTCCCAGCTTGACTTTTCCCTTTAGCTTAGTGATT 
TGGAGGCCCCAAGATTTATTTTCCTTCTACATCACTGTGGGCAGCTGACT
AGGAAAGCTTTGTAGGACTGGTGGGCAGTGTGAGAGCCCAGTGGGGGGTG
GTGGTCCTGTGCCAATGGTAGCAACCACCTGTGAGGCTGAGTAAACTCAT
TTCCCAACCTCCTCTAGCAGCCCCAGTGGAGATACAGAGGAAGCAGACTA
GCGATACAACCCAGCCTGAAGTTTTGTCTGGTGAGTGTAATGGAATAAAA
ATGGGAAGGGTGCTGAAGAGACCAGCAAGAAAATGGTTGAAGAGATGGGG
CACAGAAATTAAGCTGGATCAAAAAGGACGGAAAAGCAGAAAGGGCCGAT
AGAGAGAGGGGATATCTATGGGTTCGCGATTCTGAAAAGGACAAATCACT
GGTGCTTTGAGAAGAGAGAGGGTGAGAAAGCAGGAAGGCTGGAGGCTGTC




GTACTCAAGATAATAAGTTCCCAGAAGAGAAGTACCCATATTTAATTCAT
CTGTGTCTTCCTAGCAGTACTAAAAATATTATATGAAAGGTATCAAACCT
TTGAGAATGTGTGCTGCTAAATTGTTAAGGATGCTGGAAAACTCAAGACG
TCCCTGATCCTGAGCCTGAGTATGAGCCTGTQ3TGAGCCCAATGCAGGTC




TGCTTTATGAGTAAACGTTACCAGGGAACACTTTGCAGTCTTAACACACA
TGCCCACCTGTGACCACTGATCCCTGTTGGGTGACCACTGACATCAGAGA
TTCGATGGCAGCAATGAAGACAAGGCTATCCTCATTAGGAAGGAAAGGAA
GGAGGAGGGAGGAGGGCAAACGAATCTTTCCTGCTT3TCAACCACGTCCA
TCTCTGTTAGGTGATTTCCCATGTGTGACTTTGTTTATCTTTATAATAAC
TCTGAGAGGTAGGTCTTGATGTCCACATTTTGAACATGAGGACATCCAGC
CAGGAAGTTGAGTTCTGGGGACATAGCTGAGAGGGCAAAGCTACATATAA
ACCCCTCTTTGTTTTTTCTGGCTTATCCACTGAGTGCCCCCTGCAATCCA
CCAGCCCATTTGTGAAGTGCATACTATAGGTAAGTTGGCACAGGAGGAGT



3GATGTGGGCGATTTTG L ,IACAGCTCTCCAGGAACTTACACACTGGTGAij'
3AGGGCCAGGTATGTTCCTGACCAGTCACAATCAAAGCAACCTCCTACTA
ATCAGGGAGGCTTGGTACCTGGGGAATGCTATGTTGAAAGGTTCTTTTCT
GGGTTTTAAAATGATGGGTCTATTTCCTTATTCTTAAGATTGCTTTTTTT
CTGGCTAGAACTTAAAAGAAATTTTCAGTAAAATTTCCCTTCCCTGGCAC
AAAGTGAGCTTGAAATGAATTCCCAGGTGGCCTTGATACTTTAAAATATT
3CCTCCTATAAAATCAACCTTTAGAAGAAGGAAGTCAAAGAACATGCTAG
A.TTTCACAAAGGTTAATTCCTTGAAATCCAGTTATCTACAGGACAATGTT
GTCAAAGAAAAAATTATTTGGCCAGGGACGGCGGCTCATGCCTATAATCC
CAGCACTTTGGGAGGCTGAGGCAGGTGATCACCTGAGGTCAGGAGTTCGA
GACCAGCCTGGCCAACATGGTGAAACCCCATCTCTACTAAAAATACAAAA
AAAATTAGCCAGGTGTGGTGGTGGGCACCTGTAATCCCAGCTACACGGGA
GGCTGAGGCAGGAGAATCGCTTGAACCCGGGAGGAGGAAGTTGCAGTGAG
CCAAGTTCAAGCCACTGCACCCCAGCCTGGGCAACAGAGCAAGACTTTGT
CXCCAAAAAAAAAAAAAATTCAATGATATTTTTAAATTCATGGTAAGGAA
GAT"*TCATTCAGAACCAGCACAGAAGATATAGGAAACACTGCAATGGGAC
TTTGCGGTGGGGGAGAGAGATTGAACACAACTACATATACAGCACGGGCA
AGGACATATTCATAGCCAGGAAGCAGAGCAAAGATCAGTGGATGCGAAAT
TACTAAGAGGAAACATGAAAAATAAGGGAGCTTCTGCCTAAACCCACCTA
ACCGGATCCTTGCTGAAGACAGGACAGGGTGATTGGACACCACTTTGGGG
VrGGTGGAGGATGGGGAATCCAGTGAGATTTCAAGGGTGATGCGAIATTG
AACATACAAAGTTCTTGCTAAAAAAGGATTTTACAAGAAAGTGTACAAAT
GTGCGTGGGACAAGGTGCAGGAGCCCGACGGAGATGTGGTCCAGCAGAGA
ATATGTGCCGAGATGATAGGTGAGTTCTCTGACGAAGGATATATGCTGAT
CCAGCCAGGGTGAAATGCTCAGAGAAAGCACGGAGGGGCTATGTCCGTTG
CCCCAGTCTCCACGCGGTCAAATCTGATCCCGTTGTGAGTGTGGCCGTTT
GTAGAAAGCAATCAGGGGGGGTCCCTCCCC
>Contig30

AATATATATTTTTTATANNATNTGAGACAGGTTCTCACTAGGTTGCCCAG
GCTGGTCTTGAATTCCTGCCTTCAAGTGACTCTCCCACCTTAGCCTACTG




CAGTCTCTCCTATCCTCCTTTAATCAGTTTTGCTATCCTGTCAATTCCCC

CAACGAGTGTGATGTTGTGAAATATATATTTGTTCTTCATCTCCTGTTTC

CTGACATACAGCTTTTAAAAACCCTTGGAATCTCTGGAATAATAAGAGTu

— r— •GCATGCTAATAGATGACTGCTGGCTGGCAGCCCCAATGCAGTAG

C~-CATGATGGGGTTTGTCACAGGAAAGACCAAGGCAGGATTGGAGACTT

GAGAC~G' r ""'AGCCCCACTCCCCAACCACTGGAGGGAGTGGAGGGGCTGAA

GGTTGTGTCAGTCACCAATGGCCAATGGTTCGGTCAATCATGTGTATGTA

ATAAAGCCACTCTTAAAAACCCAAAAAGGACAGGGTTTGGAAGGGCTCCC

AGATAGCTGGACACATGAAGGTTCCTGGAGGGTGGTGCCCCAGAGGGGCA

TGGAAGCTCCACACCCCTTCTCACATGCTTTGCTCTGCGCATCTCTTCAT

CTGGTGTTCATCTGTATCCTTTGTAATATCTTTTAGAATAAACTGGTAAA

CTTAAGTGTTTTCCTGAGTTCTGTGAGCTGCTCTAGCAAATTCACGGAAC

CCGAGGGAAGCAAACCCAGATTTATAGCCATCAGTCAGAAGCATAGGTGA

CAACCTACCACTTGTAACTGGCACCTGAAGTGGGAGGCAGTCTTGTGAGA

CTGAGCCCTCAACCTGTGGGATCTAACGCTAACTCCAGGTAGATAGTGTT 




TGTTGAGGTGTTGTGACAGTATGGTAGGGAAAACTGCGTCTGGTTTTTTC
CTTTTACAATCAGTTAAATATTTAACACAAGTCTACTGTATATTAGTAAA
AGGGTTACATTTTTTAATGTCTTGACAGTTGCACTTTGACAACTTCCATA
TCAATCACTTTTTTTCGTGTCCGTTTGGAACCAAAATCACTTGGGATACC
ATGAACCAGGCTGCAGCGTATTCCCCAGGCCTTGAAAGCTTGGAGGCCAT
TTTGCCAGCCNTAATCCCTGTGAATACCAGGCTTCGTGGATTTAAAAAAT
AGACTTGAGGCCAGGCCTGGTGGCTCACACCTGTAAGCCCAGCACTTTGG
GAGGCAGAGGCGGATAGATCACAAGGTTAGGAGTTCGAGACCAGCGTGvjw
CAACATGGTGAAACCCCGTCTCTACTAAATATACAAAAAAAAATTAGCwsj
JCACTGG

~* imT* r ^^cc. cci\. xACTCCCCAAACCCTG&GTrA 



^ A^-^rr; w * ^ * 1 CLAGCC CCTCCTCTACTCCCCAAArrrrr 



, tsc _.^ at _ ^^^'i^tpttttttttttgaaatggagtctcactg 
tgacccacctgcctcagactcccaaagtcSgggaS^a^SSS 

£S»^^ ccctgtctc taaaaaaaaaatacaaaaaaaattaaaa^ot 




^^^i^^^ m ^^ a ^^ G ^ CGA AATGTTACAC 




I? AiS 1 * wTGGATCCATCTCTT AAGAAAGCGCTTTGCCTCCCCGGTTATT 
.AGuTGATAAGAAGTGTCCTAGATGACAGCCCTGT.aaTrtrt^Tv-«a^./-«
CAACAC cr CTCACTAAAGAGAAAGAAATAAAAAAGAAAATTAAAATCTGC

Z GCAATGC CCAGACAGTCATTGAATAACTGCATGTGTACAGCACTTOTTT 

ACTTTTACATACTTCATATTTTAGCCTTCATAGCAGCTCACAGGGGTGGA 

TTTAATTTTT AGTCCAACT C CTGTCACGGTGC CTGGCACAAGTATAATAA 

ATGTTCTGTGAATAAATGACC CT CI ' T T TT A GATGAGGAAATCGAGGCTCA 

AGGAGAACAAGCAATGTAATGTCCCCCTCCTGTTCAGCCATCTGCCTTTC 

ACGCCACTGAATGCAGTAGTCCTCAGTCCCCTGAACrTGACCCTCTTCTG 

CTTTTCGGACTGGTCCTTCTAATCCCGTTGTGACTCACTACACCACCTCT 

C CTG CATATGACATCTACATTTTAAAACAAAC CGTATGGAAATAACACAT 

TAGTCGGCTTGTTCCCCCACCCCCGCAAAAAAAAAGGCCTCTTTATAACA 

GAAACTTCTCAGGCTGGTAGGGGAATTTTATTCCCCCATTTATGGTAGAA 

AGGC C C7AACCTTGGACCTCACGC CATAGCTATTCACATGGGGGAATGAT 

GAATAACATGGGGAGCAGCATGTAAATATCATTGAGCCGTAGTCCAGACC 

TATAACACATC 

>Contig31

GGGGGAGCTGCATGTGCCTGTCGAGATCTGGGGGAGGAACASGAAGATCA 

AGAGTT CTGTGT AGGACATGTTAAGTTGAAGGTGCTTACAGGATAGC CAG 

ATGAAGCATCAGGTGTGCAGTCAAAGATATGAGTCTGGAGCAGCACATCC 

T AAGTCACCTCCTGCACCAACACAGAACTTCCAGGCCACTCACTTGAGCT 

CTCCCAAATAGTTTCCAAGTGTCATTATGTTAATAACCTATGAGCTTGAA 

CACCAGATTCAAACCCCACTGCATGGCTTTTAAAGACCATCTCAAGGGCT 

^GACACTCCAGGGAGCCAACTAAAGATGCCTGGTCCTACCATCAACCTCC

ACCCGATTTTTTATAGAAAATGTTTCTACCTGTCCTAAGGCAGGGTCCTG

CCCGACTCCCAGGCCCCTTTAGATCCCCAATATTCCTCCTCCCTGAACCA

AAACCGTCATCATCTTCCAGCATGGGTGGGGCCTCCATTCTTGCTTCTGC 

TCCCCTGAGCAGAAGCAAGTTTCTCCCAACTTGACCTGATTCTCCTCCTA 

AGTACCAGTCACTGCTTTGTTTCTGGAATGAGAGAAAAAGACAGAGTGAG 

AGAGACAATCCAGAACTCTTGCTCACTCACAGCTAGGCTGGGCATCTGGG 

AGGATGGCTGTGTCCATGGGAACCTGGGAAAAGCCACACCCTTGGCACCC 

TGGTCACCCACCTGTCTCCCTGGCAGATTCCGCACTGCTCTCTTGCACCC 

TCTACCAGGGCTAACCGGCCTGCTCACTCTCCCCAGCATGTCTTCCCACG 

CCCACTCTCTAATTATTACATTCCCTTCACATAAACTGCCCTTCTCTCCC 

AATCACCACATGTTCACTTCCCACCCAGCTGTCAAAGTCTGGCTCAACCT 

CATTCTTGAAAAGGAAAAAACAAACAAACAAACAAACAAACAAGCAAAAA 

ACCT ATG AT GGATTAAGAACACACTTCATTC CAGGAACATGCTT ATCTCC 

^CTAACTCTCACAACAACTACAGCAGGTAGGTGTTATCACACCCATCTCT 

CAGGTGAGAAAACAGGCTCAACGAGTGCAGGAGGACACAGCAAGTCAGTG 

ACAAAGCTTAAATTCAAGCCCAAGCCTGTTGGCAACCAACGTCTGTACCC 

-TGATAGCTACCTCATTTACCACCAAATCCAGTGGCCTCAGGCCTGGCTG 

CACACTGGGATCACCTGGTGCCCAGACCACATCTTAGACCAGTCATACAG 

AATCTCTTGGGCTGGGATCCTCCACGGTACATTTTAAGGGTCCCCAGGTG 

AGTTCCACCATGGACCCAGAATTGAGGACCCAATACCGTATACCATCTCC 

TTCTTCATCTCTTCTAAGGCATCTCTTACTCGCTGTGCACTCCCATACCA 

r TTT GTT C AAT CATCCAAT CATTCATT CATTGAGTCAGTT AGT CAGGAGC 

TACTCACTAGTCCCCTGCCAGGTCCTAGTCAXGACATAGGGCTCTGGGGA 

CCAACAAGAAGCAGGACCCATGCCTCCTGCTCTCATGGAGCTTGCTCTGC 

AGCAGAGGAAGCAGTCAGTGAGATGTAGCAAATGTGAAATGTGCACAGAT 

GGGAAAAGCAAAACTTTAAAACTTTTAGGACAAAATACACAAGAAATCTT 

TGCAACTTTGGGACAGGAAGGAACAACATTCCTTACACATGACACCAAAG 

GAATCAACCATAAATAAAAAGGTGATCAATTTGACCTCATTTAAGTGTTA 

AGCTTTTTTCATTGAGAGACACCATTAAAAATTAAAAATACATGCCACAA 

ACTGGGATACAATATTTACAACACTTATGTCTCACAAAGGATTAGTTTTC 

AGAATATATAAAGAACTCCCGGCCGGGTATGGCCGCGCACGCTGGAATCT 

CAGCACfTGGGAGGCCAGCGGATCACATGAGGTCAGGAGTTCAAGACCA 

GCCTGGCCAACATGGCAAAACTCCGTCTCTACTAAAAATACAAAAATTAG 

C CAGGCA7GGTGGCGGGCG CCTGTAATCC CAGCTACTCAGGAAACTGAGG 

CAGGAGAATCACTTGAGCCCAGAAAACAGAAGTTGCAGTGAGCTGAGCTC 

ACATCACTG7AAGCCTCGGTGACAGAGTAAGAC7GTCAAAAAAACGAAAA 

~ \AAAACAAAAACTC CTACAAATAAAT AAGAAAAAAAT AGC C CAGCAGGA 
^^GTATATACATTTCATAAAAGAATAAATACATTCTGTCAGTTTTCTA
^~I^^^^^^^^AAATACAAATGGTTAGGAAACATTTTTTAAA
GCTGAAGGGAAGGAGGGTCriTTCAGATar.^™i;™E^E^ 




^VVXX. ZZ" - - -^-^^UAAACAAAGCAGAAGGTGAGGAGAATAC-aAT 

TGAGTGGGTCTGGTGAGGAGAAACAAGTAGAGTACAAAGTTAAAGGAGAG 

AAAGAGCAAGGTCCCTGAGAACACGGGCCTTCTGTTTAAACCC^GCCAG 
CTACAAGTTAAAGAGCATGflTrrrrn »m »n« ^^^ZZ:„_„ ZVzi 




* '^^^AGAAAATGCTATCATTATTATTACAATTTTAT 
™TAAAGGATACAAATCAGAAGGACTAGCC^ 
AGAGGACTAGTAAAAAACAC»GCTTCTGr G Trr;^^^™™5 




--"^X*2?p^pi^^t A i^^ iT ^CTCAATCTCCAGTGTCCCTC 

^AAC * >. * AATTACCTTTTTGGTCTTTTCAGGGACTAGACCCCATCCTGAA 

^^^^^^^^^^^^^^^^AACATAAC^AGACAC 
^ATATTACTCAGAAAATTCCAACAGTTra^ 

CTGGGACATAGATCAAATT C 1 1 1 1 1 1 1 1 1 1 1 Til L rSxS SSSsOT 

cttgctgtgttgccca«^agagtgca^g^cagaS?a^^^ 

AGCTTCAACTTCCCAGGCTTAAGTGACCT^ScCT^ 

^^EZS^^^^^^^^^^^^cctggcctctgtgatggttaatac 
I g ^I g I g ^ gggtg ^ gccaaa ^^ttacatttgagtcagt 

GAAAGTCCACCCTTTCCCAGT<^CTC5GGAC»CCCACCCTCAAT^ 

aaacacaatctaatcagctgccagtgtggtcagaataaaaggaggcagaa 

GAACAGG<3AAACACTAGACTGGCTTACrr^ 




tgcagatggccaattgtgggactttaacttgtgatcatgtgagtcaatat 
*.A. i^iAiAi -atatacagatatataatatcctattatatacagatata
T AATATC CTATT ATAThCAGGTATATATATATATATGTAT CATATATAXA

TAT C CT ATTGGTT CT AT C C CT CTTG AGAAT C CTGACT AATACAGCCTC C C 

AAAATGCTGAGATTACAGGAGTGAGCCACAGCCACCATGCCCAGCCCCAA 

ATT CTTAATTATACAACAATGGGT C CAGAGAT CAGGGC CTGGGT AGGAT G 

CAGCAATAAGAAAACAGATGGTGGATGGGGACACATGTTGGAAGTGTGGC 

AGGACAT GGCTGAGGGAACT CATAGGATGGTGTCTATTTTCATGGCTGAG 

TGTGAGGAACAGCATAAGGTCAAAATTTCAGGTCAATGGTGAGTTTTTTA 

j^ATTGTTGCTGTGAACCCCAAAAATCTGACCCAGGTCTCAGTTAATTTAG 

AAAGTCTATTTTTCCAAGGTTGAGAACACCCACCCACTCACGACAAGAGC 

AT CAGGAGGT C CTGACCACATGTGC C CAAGGTGGTAAGAGCACAGCTTGG 

TTTTATATATTTTAGGGAGACGTAAGTCATCAATCAATATATGTAAGATG 

TACACTGGTTCTGCCTAGAAAGGCAGGACAACTTGAAGCAGGGAGGGGGC 

TTCCATGTCACAGGTAGGTGAGAGACAAACAGTTGCATTCTTTGAGTTTC 

TGATTATCCTTTCCAAAGGAGGCAATCAGATGTGCAATTATCTCAGTGAG 

CAGAGGGATGACTTTGAATAGAAAGACAGGCAGGTTTGCCCTAAGAAGTT 

CCCAGCTTGACTTTTrCCTTTAGCTTTGTGATTTGGAGGCGCCAAGATTT 

ATTTTCCTTTCACATTTCCCCCCCTTTCTTTTTAAGAATCTTTTAAAGAA 

AGCTTTTAAAAAGAAAATGAGTCTCTGGTC C CAGGTTTCATCTGAATTCT 

CGAGGGGAGGATGGTTTATCCTAAACGGGTGGTTCTGAATTTTGAGAAAG 

TGCATTGTAC 
>Contig32 

AAAAGCCATACGAATGAGGAAGAATTAAGGGCCAGAACAAAACAAGAAGA
~ G AGGGAAAGTTTGGAACTT CTT AGAGACTGG CTAAATGGTTGTGACCAA 
AATGCTGATAGTGATACGGACAATGAAGTCCAGGGTGACAAAGTCTCAGA
TGGAAATGGGGAATTTGTTGGGAACTGGGCAAAGGTCACCCTTGCTATGA 
CTCAGCAAAGAAATTGGGTGCATTGTGTTCATGTCCTGGGGATCTGTGGA 
AGTTTGAATGTAAGAGTGATGACTTACGGTAGGGTATCTAGTGGAAGAAA 
CCTCTAAGCAACAAAGTGTGTTGCTTAGAAATTTCrrTCTTTCTTTTTTT 
TTTTTTTTTGAGCTGGAGTTTTGCTGTGTCGCCCAGGCTGGAGCGCAGTG 
GCGCAATCTTGGCTCACTTCAAGCTCTGTCTCCTGGGTTCATGCCATTCT 
CCTGCCTCAGCCTCCCAAGTAGCTGGGACTACAGGCGCCTGCCACCATAC 
CTGGCTAATTTTTTAGTATTTTAGTAGAGACGAGGTTTCAC CATGTTAGC 
CAAGATGGTCTCAATCTTCTGACCTCGTGATCCACCCGCCTTGGCCTCCC 
AAAATGCTGGGGTTACAAGCATGAGCCACCC CGCCTGGCCTGCTTAGAAA 
TTTCTAAGCCAGGATATGGCCTGTCTGCTTCTAACAGCCTGTGCTCAGGG 
GTAAAGAAATGACTTAAAGTTGGAACCTATGTTTAAAATGGAAGTAGAGT 
^TAAAAATTTGGAAAATTTGCAGCCTGGC CTTGTGGCAGAGAAAGAATC C 

aagt aggct g cagagcaatcattgctagagagattagcatgactaaaagg 

z agc caagtgctaat attcaagacaatgttaaaaaggccttgagggcatt 

^cagagatctatgaagcagcccctcccatcacaggtgcagaggtttggtg 

cactaggcccagaggttttatgggccanngccagggccacactgctatgc 

acagctttgggacactgctgcccgcatccaggccactctgctctggctcc 

acccttggctcaaacgggccaagatagagcttggaccactgctcccgagg 

gcacaagccataagccttggtggto:ccatgtggtgttaagcctgcaggt 

gcccagaatgcaagattgagggagcttgggcacttccacctaaatttcag 

aggatgtgtcagaaaccctaggttcccaggcagaagcatgatacaggggc 

agagcccttgcagagaacctctactagggcaatgccaaaggaaaatgtgg 

ggttggagtcctcacacatggtccccactggggcactacctggtgatact 

gtgggaatggggctgctgccctccagaccccagaatggtagatgcactgg 

cagctggcaccctgagcctggaaaagctgcaggcactcaactccaaccca 

tgagat cagccacatgggctactcccagggaagcccacagaggcagggct 

gtctaaggccttgggagcctaccc cttgaac cagcttgcaggacatggaa 

t caaagattatgttgcagctttaaggcttaatgttttccctgtcaatttc 

AGGCTT GTGTGGGAC CTGTTG CTTTi.T l'T '1 1 1 1 1 1 1 1 1 1 1 1 1 1 TTTTGGT 
CACAGGTGTTTGAAC CAGAACAATTCCATCTTGAATAGGGGCTGGGTAAA 
ATAAGGCTGAGACCTACTGAGCTGCATTCCTAGGAGGTTAGGAATTCTAA 



AAAATGGTGATATGGTTTGGCTCTATGTCCCCACCCAAATCTCATCTCAA 
ATTATAATTCCCATAATCCCCACATGTTGAGGGGAGGACCTGGTTGGAGG
^I-C^GGCAATTTCCCCCATGCTGTTCTGGTOATACTGAG
•^^i^;^J;;;CT-CTGCAGCTTTATGAAGAAGGTACTTGCT 
-^-^^-^^CCCGTTTTCTTTATACATTACCAGTCTTGGG 

^ C ~-Zt J^^CCC^CAGTCATGCCTGTGCCGGTCTCAG^ACC? 
tuftCAGi - - ACAGA x GGCATAGCAACGTCTAAAAGGTACCP "a ra ttit' a ^ 

t^caaggggaggaaccctcagctctgS^agtc^ 

iAGCwAT 4 C . -TATTCCGTTACTTTCTTAATAAAATTGrr^&rTTTar 

tgtatgtactcgcctggaattctttcttgtacgaggtccaSgccct 

>.TTC . vjvjAATGGGAATGTTTACACAATGACTGTATCACTT^TGAATrTTr 
GAAGTAAATAATTTGTTTTTGACTTTACAGCCTCA^GG 
TGACTTGAATTTCAGATGAGACTTTGGACTTTGGGACT^^GGGTTGGGG 
CTGGAATGAGTTAAAAGTTGGGGGGATTATTGGGAAGGCACGATTTl^TT 

^^;-^ g ^ agggctgct ggaaaatgtttggattacaaggctg 

uACnx . -^AiAAGACGTGTAGGNCCC 
>Contig33

^S^^f^^I^EI^GGCTCCTG-TGACCACTATTATAGCACC 
^ GCAGC 1^^ TAGACC1 ^ A TGACACCTAACCAACCCCATTAT7TACAA 
tSctatgSg?^^^^^ QCGTTTATTATTTrTTAGAGATGGGGGCT 



X^i^X^ii^^i^^ iiAU CCACTGGCTTCTGGCTGAGGCCTTACG 
^IcEr'-ri^^^^^^CCTTCTTCTACAGAAGACACACCTTG 
^iCi^—ZZ^^^^^^^^^^^^^^^TCCTTGGTAGAGGGCT 
^S^-^^^^^^^^^^^CTCTCCTGAAGCTCCATC 

ACTTCCCTCTTCAGAAAAGATGGCTCATTTTCCCTCAAW 

gctgccaagtattctgccaattcaccctcsgagcaoatcaa^ 
tctccaaatctagccccttgacttcggatttcacgatttctcccttcctc 

CTAG^CTTGATAAGTTTCCCGCGCTTCCCTTTTTCTAAGACTACATGT 
^ CATCrrATAAAGCAAAGGGGTGAATAAA TGAACCAAMoSSScT 
^GGAATATCTGCAAACAACAATAATATOVCKTATGCCATOTC^ 
T^TAGCCAGTATCGAGTTGAATGAACATA(3AAAAATACAAAACTGA^ 
CT.CCCrGTAAATTCCCCGTTTTGACGACGCACTTGTAGCCACGTAGCCA 

CvjCC - acttaagacaattacaaaaggcgaagaagactgactcaggcttaa 

^E5 CCAGC ^ GAGAGGGAST ^TTTCATTGGCGTTTGAGTCAGO^AGG 
i.ATTGTCCTCACATCTCTGGCTATTAAAGTATTTTCTGTTGTTGTTTTTC 
^^GGCTGTTTTCTCTCACATTGCCTTCTCTAAAGCTACAGCCTCTCC 
TC ^ GTCCCTCCCTGG ^ GGTATGTGA CCTAGAATTACAGTC 
AGATT - CAGAAAATGATTCTCTCATTTTGCTGATAAGGACTGATTCGTTT 
TACTGAGGGACGGCAGAACTAGrrrcCTATGAGGGCATGGGTGAATACAA 
- - j AGGCTTC7 CATGGGAGGGAATCTCTACTATC CAAAATTATTAGGAGA 

AAATTGAAAATTTCCAACTCTGTCTCTCTCTTACCTCTGTGTAAGGCAAA
~AC C— ^TTCTTGTGGTG'-rrrrrGTAACCTCTTCAAACTTTCATTGATTG
AATGCCTC-TTCTGGCAATACATTAGGTTGGGCACAIAAGGAATACCAACA 

"taaataaaacattctaaaagaagtttacgatctaataaaggagacaggta 
-atagcaaactaattcaaaggagctagaagatggagaaaatgctgaatgt 
ggactaagtcattcaacaaagttttcaggaagcacaaagaggaggggctc 
cctcacagatatctggattagaggctggctgagctgatggtggctggtg 
~~r?c" gt7 g c aaaagtcaagatggc caaagttc cagacatgt7tgaaga 
ctgaagaactgttacaggtaaggaataagatttatctcttgtgatttaa 

"■GAGGGTTTCAAGGCTCACCAAAA7CCAGCTAGGCATAACAGTGGCCAGC 

ATGGGGGCAGGCCGGCAGAGGTTGTAAAGATGTGTACTAGTCCTGAAGTC

AGAGCAGGTTCAGAGAAGACCCAGAAAAACTAAGCATTCAGCATGTTAAA 

'-TGAGATTACATTGGCAGGGAGACCGCCATTTTAGAAAAATTATTTTTGA 

GGTCTGCTGAGCCCTACATGAATATCAGCATCAACTTAGACACAGCCTCT

GTTGAGATCACATGCCCTGATATAAGAATGGGTTTTACTGGTCCATTCTC

AGGAAAACTTGATCTCATTCAGGAACAGGAAATGGCTCCACAGCAAGCTG

GGCATGTGAACTCACATATGCAGGCAAATCTCACTCAGATGTAGAAGAAA

GGTAAATGAACACAAAGATAAAATTACGGAACATATTAAACTAACATGAT 

GTTTCGATTATCTGTAGTAAATACTAACACAAACTAGGCTGTCAAAATTT 

rGCCTGGATATTTTACTAAGTATAAATTATGAAATCTGTTTTAGTGAATA 

CATGAAAGTAATGTGTAACATATAATCTATTTGGTTAAAATAAAAAGGAA

GTGC~-CAAAACCTTrCTTTTCTCTAAAGGAGCTTAACATTCTTCCCTGA 

\CT^GAArTAAAGCTCTTCAATTTG7TAGCCAAGTCCAATTTTTACAGAT 

AAAGCACAGGTAAAGCTCAAAGCCTGTCTTGATGACTACTAATTCCAGAT 

-"AG7AAGA7ATGAA7TACTCTACCTATGTG7ATGTGTAGAAGTCCTTAAA 

TTTCAAAGATGACAGTAATGGCCATGTGTATGTGTGTGACCCACAACTAT 




CCATTGATCATCTGTCTCTGAATCAGGTAAGCAAATGACTGTAATTCTCA 
TGGGAC7GCTATTCTTACACAGTGGTTTCTTCATCCAAAGAGAACAGCAA 
TGACTTGAATCTrAAATACTTTrGTTTTACCCTCACTAGAGGTCCAGAGA 
CCTGTCrrrCATTATAAGTGAGACCAGCTGCCTCTCTAAACTAATAGTTG 
ATGTGCATTGGCTTCTCCCAGAACAGAGCAGAACTATCCCAAATCCCTGA 
GAACTGGAGTCTCCTGGGGCAGGCTTCATCAGGATGTTAGTTATGCCATC 
CTGAGAAAGGCCCCGCAGGCCGCTTCACCAGGTGTCTGTCTCCTAATGTG 
ATGTGTTGTGGTTGTCTTCTCTGACACCAGCATCAGAGGTTAGAGAAAGjl 
-T'-CAAACATGAAGCTGAGAGAGAGGAAGCAAGCCAGTTGAAAGTGAGAA 

GTC"" ACAGC c act catcaatctgtgttattgtgtttggagaccacaaata 

3ACACTATAAGTACTGCCTAG7ATGTCTTCAGTACTGGCT^AAAAGCTj 

--'^^aaaggagtattrctaaaatattttgagcattgttaagcagatttt 
-aacctcctgagagggaactaattggaaagctaccactcactacaatcat 

TGTTAACCTATTTAGTTACAACATCTCATTTTTGAGCATGCAAATAAATG 
AAAAATCTTCCTAAAAAAATCATCTTTTTATCCTGGAAGGAGGAAGGAAG 
GTGAGACAAAAGGGAGAGAGGGAGGGAAGCCTAATGAAACACCAGTTACC 
-AAGACCAGAATGGAGATCTTCCTCACTACCTCTGTTGAATACMCMC. 
ACTGAAAGAACTTTCATTCCCTGACCATGAACAGCCTCTCAGCTTCTCTT 




GGAAGGTTCTGAAGAAGAGAtiiti i 1 1 iAAuv-^^j-v-ww^----.- 
GATGACCTGGAGGCCATCGCCAATGACTCAGAGGAAGGTAAGGGGTCAAG 

CACAATAATATCTTTCTTTTACAGTTTTAAGCAAGTAG<K3ACAGTAGAAT 

TTAGGGGAAAATTAAACGTGGAGTCAGAATAACAAGAAGACAACCAAGCA 

TTAGTCTGGTAACTATACAGAGGAAAATTAATTTTTATCCTTCTCCAGGA 

GGGAGAAATGAGCAGTGGCCTGAATCGAGAATACjrGCTC^GCCATTA 

^C^AGCCATATTGTAAAGGTCGTGTGACTTTTAGCCTTTC^GGAGAA 




AAAATGGAGATAATCAGAATCCCCCACTCATTGGATTGTTGTAA^ 

AGAGTCTCAGGCTT-ACAGACTGAGCTAGCTGGGC^ 
iJ^GATTAAATGAGTCAACATCCCCTAACTTCTGGACTAGAATAATG i CT
-^-i^^E^^CTCACTGC^GCTCTGCCTCC^GGG?

^PI^^CAGGCGTGAGCCACCGCGCCci 
rJSiP CTACTACCTATATGAATACTA CCA 



CT 
AC 
GTT 



rTACCTGAGTAAAAAAATCATGGTTTCTCCTCCATCC^A^^ 

^?S;^ T ^ C7ATAAA ^^ 

^^ G ^ TAACG ^ TG ^^CCTAGAAGG7TTTAAAATCT^ 
ATAAGAAAATAAAAATCACTCATATCGTCAGTGA^t^^^^ 



gcttttctgacttagcaatgtattgatgacattatgcatgcSagacctc 



>Contig34



r A S5^--^iZ A ^^E^ OTCCT ^ GCTG ^ CCGGG ^ G TTGGCTGGA 
pE^^I-^-^EP^CCCAGAATCTTCAGCAAGCCAACTCAGGA^ 
5i^ G ^^r^ ^ TGT( -AA G 'I'CTAGGACCCAGGC7CTGGGTGACTGA7'^ 
GAG ^ A ^^ G "AAAATGAGGAAGTCTGAT^GTCTAAAGGT 



w w ' C CCCTAGGCCATTTGAATCATCTGCTAATTGGTTT^C^ 

.OA. . -CCACGGAAACTTCCTCCATCCCTTCCTCACATATCAGCCACAGA 

ATTwai -AC-uvji TCCACACCTCCTTTGTGGATAAGTTCAAGCTCC'GAG 
TGTGGCAAGCAGGGCCCACCTGGAATCCCCTGCCCTCCTCTCCTATCCCA 




CTCC 
CTTCACCC 



^.^uiuuiii uuAXAACCTCJTTCTTCCTGACTGGAAACTCCTT 

tc^tgtagtttggctw^ 

ACCCCCTCTATAACAAGTCCTTTCCCAAGCTGGGTGGTGGATGCTC 
GTGCTGTGTGAGTCTTGAAriTrr^ra/*™ * *or^,~~~^~LLZ 



GGACACCTTGCTCCCC7GGCTGGCTGGAAGCAGATGGTGTCCACCTGCTT 
C^TCGGAATTCTGCACCTTTAATAAAGTTTTATGGGACAG^ 
TGGCAT7GACATTGTAACGAGGAATGGGTGGTGCCACCTTTGCTGTGTCT 
rACCAGAAATACCTGTGGCAGGTAAATTTCTAGAGAGACCCTCCCATTTC 

i C G CATAT AGCAJV TTT*m^ A A Tr**r*T^r ^ ^*r^ * /■« — — — — ^. - . .
\CCAGGCATTCAGTTTCiAGGGGTGTGTGTGTGTGTGTGCGCGCGCGC, - -
~GTGCATGAGTGCATGCGCGCGCGTGTACATAGGGGAAGGGAAACAAAAC 
AAAAGTACACAAGACATGATAGTTGTCCTCAAGGAGTTTTTGCAAATGTT
CACAATTTAAGAGAATATGCTGTGCTGTGGCTGGTGTATAAACCAACTGC
~AGGGAGAGGCCTTCCACACACACTTGGGGCAAATGCGACCTCTAGGACT
GCCAGTGGAATCTGGGCATGCTGTTTGTGGTCGATAAACCCTGGTCCCTT
GATCAGGGACCTATGTTTACTTTTCCTCTCCCTGGAAGTCTTCATTAGTG
GSC ATC CAGAAGGTCTTGCACAGGGCAGAGGGAGGCACAAAGACAAGAGT 
*^Q^^(^r^r:rr m rtn&r&&rR&&^Tcar:M^TTrTATgTTTACAAAAAAAAT 




TTTAAAAAAl i/\vjv.u^\jwia\jvjaa ^ uw*.lvjx www 
~ CAGG AAGCTGAGGCAGGAGGATT CC CTGAGAC CAGGAATTTTGAGGCTG 
CAGTGAGCTATTAAGTTGGCGCAAAAGTAATCGTGGTTTTTATCATTAAA 
AGTAATGGCAAAACTTTTAATGACAAAAACCGTGATTACTTTTGCACCAA 
TTTAATATGATTGCACGACTGCACTGTGCTCCAGCCTGGGCAACAGAGTG 
GGAC C CT GT CACAAAATAATAAATAAATAAAATGTAAACATGTAAAAAAA 
ACCCCAAAAACAAAAAAAATGGGTGTTGAGACCCCTGAATTGAGGAATAA 
^AGGAAGGAGTGTGATTCTGTGTGTGCATGCATGGGTGTGCACCCTCAGT 
GCCTGGGTGGCTTACCCTGGGCTAGTTCAGGTGGCAAATGGTTTTCCTCC 
AGCTGGGCTACCACCATCTTCCCCCAGGGCCTGTCCATGTATTTGGTGGC 

AAGATAC CTATGGACTAGAGTCC CTCCTCAGAGGAAAGG CT C CTCC CATT 

TCTCTGGCTTTCAGGTAGTAGTCCATGACTTCAACAGGTCCCCAGTGCAA 

-GTTATGGGTTAGTTTAGGTGGGGTCTCCTCTGAGAGCCTCCCATAGCCC 

AAAAGGCCCTGTCCTAGCTGGCACTGCATCTCCCTCTTCCCAGCTCTCAG 

cr ^^ CT ~^TGCTCATCCCACTCCGCACAGGw^^ 

GAT GT GTCAATC CTGCC CCTAAGGGATGCAAGGCAATTTGTCCTTTTATT 

- rrTfraflTfTTl 




GAACTTTGGTAQKiLuAAUU i A<jVjAL»MA1 iVjV«J.lvj/\o\-i.v_nw>j«w* 

GCTGTAGTGAACCATGATTGCACCATTGCATTCCAGCCTGTGTGACACAG 
CGAGACCCTGTCTTTTTTCTTTTTTTTTTTGAGACAGGGTCTCGCTCTGT 
CATCCAGGCTAGAGTGCAGCGGTGTTTTTCTGCTCACTGCAGCCTCAACC 
TGCACATTTTTTGTAGAGACGGTGTCTTGCTATGTTGCCCAGAGTGGCCT 
CAAACTCCTGGGCTCAAGAGATCTTTCCACCTCAGCCTTCCAAAGTGCTG 
GGACTACAGGCGTGAGCTACCGCGCCCAAC^AAGACCCTGTCTTAAAAAG 
» . . « p* » » * ftTa a KmrTrrrrrt.i.nrri " l-L 11 !"! lTTTT ' l 1 ' 11 IGAGACGG 




AGTCTCGCTCTGTUUv-Wt-AljOt. i(jv*imv»<j«-«vj a\j\j*-www»» - » » — >- - — 
-TGCAAGCTCTGCCTCCCGGGTTCACGCCATTCTCTTGCCTCAGCCTCCC 
3AGTAGCTGGGACTACAGGTGCCCGCCACCACGCCTGGCTAATATTTTGT 

TAGTAGAGATGGGGTTTCACTGCGTTAGCCAGGATGGTCTTGATC 

— C^CACCrTGTGATCCGCCCGCCTCGGCCTCCCAAAGTGCTGGGATTAC 
AGGCATGAGCCACCGCGCCCAGCCAGACCTCTTGAGTCTTAAACTCCTCT 
GTAGTTCCAGCCACCCTTTAGCACATGACTCTGTTAATTTTGTTCTCACr 




CCACTGCTTAATCAGAGTAGGTCCCTGTCAACTTAiiUAi i w»uyw 
CATGCCAGTGTGGATGATTAAAATTGTTGAGTGGAGGCTGATCAGATGAG 
C CATCTCCTTCCAAGTCCTCACTTGCTGGCTCC TGTCT TAGTTrTAGTCC 
CCATTCTTCAAAGAACGTGAGCCCTGGAAAGTATTTTAGTCATTTAGTTC 
AGTGCCTTTGGATGGGAGGATCACATCCCTGGGTCCCGTCCTGCAGACTG 
PTir^riftr. & TTrrrTGcrTrcrCTCACTTCGGCA 



AGCTGCAGrTGTTCCTGCCCACACTGGGCGTTCAGGTGGAQA^WK~tAi 
GGCACTACACTGAGCrCTCGGCGAAACCCAGGACTCTGAAATCTCGGTGT 
CAGCCACAGGCCACTCTTTTCAGCAGGACTTCAGTCAGTCCTGTCACTAG 

gctgtcgagcacatggtaggctttacccc 
Iaggagtgtgcttgctgatagcatgtgtgangggacgaggagtaaataat 

—CTGCCTTCAAGAAATTGCAAACTAGTAATGGAGATAAAATCAACAGAG 
3AACAATTAGAGTATAAGGTAAAATCTAAGGGCCATAAGAGAGGAGAAGA
^^^^^J^GGTTGGAGTCCCACCTCCGTGACTTCTAGC

ATC. - -AAAGCTTGTTATGTAAAGTGCCTAGGACCTCGTAGGCATCAACa 
;t; G ^^55 AGTA ^ 7TG ^^TAGGAAAGGGCTCCAAGCACATCTAA 






^™™^-^^GGCaVGACCCCGCCTCCTGGTACAGAGGACC 
ACG x G\j A -ACAATGACATCTCTTTGCAAGGCTGCTGCAGGGCTTTCTGGA 

aatacgcccgtgaggtatctgggcctgcgcacSctccIcSSSg^ 
cccagacgtctacctgggggtcccgtctgcgctcc^ 

ctcac , c -«tggcctcctacttccccggccgtgt7ccaacgcctgagaat 
aa E ggg ^ cagcggtcgtact ^ccgacagcgx3cagcagcggtaggcccg 

Gov.^w^CCATGACTCTTCAGTGACAGTTrTTCTTCAAACGCCGCSCCrT'G 
^iZ^^ A ^SA:"£^ AGTCGT ^ G ^ CG ^^CGGGAO^CAGGCAAT 

-« : ^^J^ctctggccttgagaacccgaccttggggccctt 

^^-^^^^^^C^CACCTCGGCATTGAGGGCGGCTTCCTCGG 
r5 GG ^^i G 5S G £ CCGCCTCTGAGTGCG CCTGTGAGTGCGCCTCCGAGTG 



i J,J> - - ^^^^CTTAGCATAAAGTATTCAAGGTTCATCCATGTG 
: ~2j£ A G ~ 37CAG ^ T IE£CTTrCCrrrrAAGGGGGAATAGCATTTCGTT 
^. ^ -T^AGATGCCACATTTTGCTTCTTGGTCCATCCCTCTCCGGACACTT 

AAACTGCCTTAgTgPTrTr&r'riftTnaT^t o wii^^x. . ~~— - -- 




AGGCCTAGTTTCTCCTAGTTACTAGGAGATCGCAGGCCTCGTTGTCCTGA 

^AAAATGCTTTTGGCrCCTTGTGTATGAAGGTTCCTCTTCCATGGATGCT 
GGAGAAGTCGTGTGTGQAOGGGCAGTCATATCTGGGCACCTGTTGGCCAG 
3TTCAGCTTACCAGTTGGGTACTCAGCAGGGCATGAAGCCACTGCAGCAG 
C C CTTCT CTTTAGCCGTAAATAGGGAGTTTGGAAGAGAGC CAGGGTTTCT 
GGATTTATGCATTTTGATATTTTCAATAGTGTATTAAATGTTTAAAATAG 
^AAAACTGATCATTATTTTTGTTAATGACTGAGAAAGGGACTCCTTCACC 
AACAGTTTCAGAAAAGTGAAGGCGGTTTTGTTTTGGTCTTTGTAGAATCT 
AGGTGGTTGAATGCATGTCAGTTGTAGAAGTCACCTTGCCTGATATCCCA 
CGCAGTGCTGGAGTATTCCACAGACCCCATGTAGGTACTGCACCTTTGCA 
GGTATACTGCTGGTGTTGGTGAGCTGCCTTACCTGTCCTGTTATTGGAGA 
C C C GTGCTTATTAGGAAACTTAAAATGAACTCAAATGAGCTTCCTTGCTT 
ACTGGTC2TAGTCCTTTGGAGCAACATAGGCCAGTTCTGCCTCGTTTTTT
TC^TC::TTTGGGTATTTGACGGTCTATTTTGTAGGACACAAAATGTGGG

AAAATAGCTAGGCAGGTTTAAAAATTCTCAACTCTACCAAGCATGGTGGC

""TATGTCTGTAATCAATCCCAGCACTTTGTGAAGCTGAGGCAAGAGGATT

3CTTGAGCCTAGGAGTTTGAGACCAGACTGGGCAACATAGCAAGACCTCG

^rrcrTAAAAAAAAAAAAAAAATTACAAAAATTAACCAGGCATGGTGGCA

CACACCTGTAGTCCCTTCTACTCAGGAGGCTGAGGTGGGAGGATCACTTG

A(3CCZAAAAGTTGAAGGATGCAGTGCACTGTGGTCATGCCACCGCACTCC

AGCATGGGAGGCAGAGCAAGACCCTGTCTCCAAATAAATACATAAATTAA

ATTCTTAACTCATTCATCAAAGTATCCACTGTAGCTTTCCATCATCCTGG

^GTTGTTTTTTTTAGAAGGATCTGGCTCCATTGCCCGGCTAGAGTGCAGT

GGCATGATCTCAGCTCACTGCAGCCCCCACCTCTCTGGCTTAAGCGATCA

CCCACTTCAGTCACCCATCTGGGTAATTTTTGTATTTTTTGTAGAGATGG

GGTTTTGCCATGTTGCCCCAGGTTGGTCTTGAACTCCTGGCTCAAGCGAT

CCATCTGCCTCCATCTCCTAAAGTGTTGGGATTACAGGTGTGAGCCACCA

CACCAGGACAATCCTGGTGGCTTTTAACGGTTTTCCATTGCTCTCAGGCT

AATGACCTATAAGCCCCTGCGGGCTTGGCCTTTTACTCCCTCAGCATTAG

CCACCTCCCTTAGCCTTAGCCCACACTACTCTCCCCTTGCTCAGTGTTAT

CCAGACACTTTGTTTTTTCCTTTCCATACTCCTCTCTGTCTGGGAATCCA

ACCT-TCTTTCTCATTTCTCTAGTTGATTATTATTATTTTTACTCTAGCA

GCCTTATTGAGATATTTACATACCGTACGATTCTCCCACTTACAGTGTAC

AATTCAATTTTCTAACATTTTCATCACCCCCTAAAGAAACCCTATACTCA

^AGCAGTCACTCCCCATTCTCCCCTCCTCTCAGCCCCTAGAAACCATGA

ATCTACTATCCATCTCTATAGATTTGCCTTCTGGACATTTCATATGTATG

AAA77A7GC AA777G7GG7 C7 C7GA7GGG C77 C7777G77ACCAAAA7A7 

CA7GGG777GA7C7AGG7CC7GC7GC7CGC7GCACAGAAAGCCAGCCAC7 

GAGA7GACAAG7A77GCCAAGGAAGAAGGC777AG7CAGG7GC7GCAGC7 

GAGGAGA7GGGGGC7CAATCTCAAA7CCA7C7CGC7GACCTAAAACCAGG 

GG777GGATAGCAGGGAAGAAA7G7AACAA7GCG7AAGAAAACAGGAACC 

AGGGAGGGGCAAGGAAGCAA7CC7GA7GAA7GAG7GG7CCAAAG7C7CA7 

7GCC7GGA7G7GG7GA7C7GGCGAG777CAG77C777GA7AC77rr777G 

AGAGGCC7GAAG7C7777CCCCAGGAAGGAAC7CAAACAAAACAAA7ACA 

AGC77 CCAGC777AAGACCAGAAGCG7CAA777C7A7G777A7 CCGAAAG 

AACAG7C7A7GGGAC7A77GG77AAG777CAC777CAC7T AG7A7 GC7G7 

777 CAAGG777A7 C CACA7AGCA7G7G7 CAG7AC77 CA77C777TA7GAC 

7GGG7A77C7A77G7GCGGA7A7ACAA7A7777A777GCCA77CA7CAG7 

~GA7GGACA7C7AGG77C777CCAC777T7GGC7A77A7GAA7AA7GC7G 

— ATGAAC777CA7G7A7AAG77777G7G7AGACA7A7G7777CAACAC7 

"A7GGG7A7A7ACC7AA7GAGAGGAA77AC7G7G7CA7ACGA7AA77C7A 

Z AACCA777GAGGAAC7GCCAGAC7G7777CCAAAGCAGC7GCAGC 

A7777ACA77 CC7ACCAGCAG7G7A7GAAAG77C CAG777C777ACA7CC 
'CAACAACAC77G77A77G7CCA7C7777AAA77ACAACCA7CC7AG7GG 
7^G7GAAA7GG7A7CACA77G7GG77777A777G7A777CC77GA7GAC7 
AA7GA7G77AAGCA7Cr777TA7G7G777AC7GGCCA777G7A7A7C7C7 
A77CAGAG7C777GCCAAT7777AAA77GGG7CAG77G7C77C77CC777 
777777GAGA7GGAGCC7CAC7C7G777CCCAGC7GGAA7ACAG7GG7G7 
GA7C7CAGC7CAC7GCAAC7TCCACC7CC7G7G77CAAG7GA77C7GG7G 
CC7CAGCC7CCCAAG7AGC7GGGA77ACACGCACC7GCCACCA77CCCAG 
C7AA7777777C77TGTAT7T7GAG7AGAGACGGGG777CACCA7G77GG 
CCAGGC7AG7C7C777G77GAC7C77AACCA7CC77CAG7C7CAGACAAA 
ACA7CC C777C7CAAGGA77G7GA77AGC77GA77A777GC77A7C777C 
"CCC7GC7AG7C7G7AAAC7GAGGG7AGGCCAC7A7A77CA77G77C77G 
GCACCAAA7AGAAAC7AAA77AA7G7C7777GAA7GAA7AGGGC777C7C 
C7777AAAGA7C CC77CAA7ACAG7AACCACAC7A7A7A7AAG7AGCCAC 
AAGC C CA77CAA7AA7AC7AC7AG7NC77GCGC CAAACC 

GGC7CAGCG77AC7A7AC7GG7C7CAAAC7CC7GGGC7CAAGCGA7C7GC 
r CCCC^CGGC77CCCAAAG7G77GGGA77A7AGGCG7GAGCCACGG7GCC 
7GGC C7 CAAA7AAC7A777AAG7GAAACAAAAC7AG7A7GGCAC7AA7GA 
AAAA7 G7 A7AAA7 C CA7AA7CGCAGAGGGA777CAAC77AC77 C777CGA 
TTA7G7AAAGG7CAAACAGACAAAAGACAA7GACAAAAC77AA7GCAA7o 
-■M.ALn.ACnui i . j, GAGTTTATGCAGAACATTTACAAAAATT^AGTG
GAAGCCTAAATTATAAAAAGTTGCTGTCACGTAGAATAA^^AACC- 

_ . vjAG i w — v^AATTCAAAGCCCTCCACACTCTCCTCTACC GCATC~~ 

TATCCTCCACCACACTGCAGTGCATACTCTGGGCTACTACTCACTGTTC'" 

-^^^:;SI^^ CTGCTA ^GAATTCTTGTCTTAGCACC^ 

- A £;rr™^ G ;^ GTT ^ 

5 A I- A S A ^--^TTCTACTTCCATTATATTATTGATCACATCTGTGC 
i^^SAi^g-^^^^^^^^^^^ACTCTTTTCTAGACCCTGATAG 
ZZ—^^^t^Jit: G 2 G ^^^^^^TTGCTCATTCCTCATGGCA 




~rir*rr~rrllZ™Z.* 4 ^ i TCTTTCTATTGGAACCA 

GGAACCTTCCCCTGAGGGCCAAAAAAGAGCAGGGAAAGGTGCAAAGACAA 
AATCTTCCAT7TTTAAACAATGTAAGAATGTGG7CCACCTCACGCTCAGG 

- GATG^AGACTCTTGCTAATrtamrra ir&rrw^ * « . ~ — — 



. . . TAGAATATGACG 

^ aAAAAaC - ACvj7GTGGAAAAATGCTTAT7CTTCTGTCC~C~ATAAATGT 

-tt^; AG ^ TGTC ^ CAAAATGAAATC ^ GTGAT ^AA^ 

-AAGAAGA7AAA7GGAGCAAAGCAC7CTG7G777CACAG7G77GGAAA7C 

. TAAAA TAAGAGTTAATAACATCAATTTACATT7TTAAArt&rarTTrrrr 
AT H 7TTAGACT ^ TGGTTGGA ^ GCTGG TAGGTG7AC^TTrGTGGAG 
AGT S CTGTTTTTGTCTGTCG ^ G ^ G ACGTATTTCAAAGCCATATC: 




; «Xzz;r^^ i^^^uii « actgaaaggctggggggaaatgacgag 

ACCCTGAGATAAGGCAGTAGTGGTGCC3AACAGAGTGGAAGGGAGGTAGTT 
oAGATATCTTCAGACT^fiaATrn^a iTmn^^^t » » 




- w4-fcr* w www a a noauvm i An i i I I LTAGGT 

T a ^IE CACTGAAGTTACATTA " CAA CACCACAAGGAAACTAGGGGAATG 
.n.uAAGvjv.ATACTGGTTTGCTTTGGAG7GGAAGGGCAGTGATGTAAGAGGA 
-TTAATGAGTTAAAGTTTGGATATGCCTGAACTTCAATTTGATATGTGCA 
rCTGATATACCCTTGGGGTGACCCTCCAGGCAATGGTTGAACATGTGTAT 




Zr. Z^^ZZZZ.~. -2 wwtl * A AWV " At-AGTACCCTATGACTGGAG 

CAAGTwTCTGTCAGTGAAATGAGGATAAGAAGAATCTTGACCTTGTGGAA 

Z A Z2J/ G77AGGAA7A ^ A ^ G ^ GA ^ G ^^AACATAGGATACTTCCTACAGG 

^ CT ^ CATGTAGTAAGGGCT ^ ATAAA TGCT7GATAAATATTATTGT^ 

TAATTTATTTCCAAAGTAAGATGCCACTGGAGGAATCTTTGGAACCCAAA 

TTAATAACAAATAGGACTGGATGCAATGGCTCACACCTGTAATCCCAGCA 

CTTTGGAAGGCCAAGGCAGGAGGATCTCTTGAGCCCAGAAATTCAAGACC 

AGCC7GGG7GACACAGGGAGACCT7G7A7C7A7GAAGAA77AAAAAAAA7 

TAACCAGATGTGGTGGTGCACGCCTATAGTCCCTGCTGCTTGAGAGGCTG 

AGGTGGGAGGATTGCTTGAGCCCATGAGGTTGAGGCTGCAGTGAGCCATA 

ATTGTGCCACCACACTCCAGACTGGGTGACAGAGTGAGACCC'ATCTCAA 

ATAAATAAATAAATAAATAAATAAATAAGTACAAACCAGCAAACACTAAT 
CCTTTCTAGAGATTATTGAACTCTGGAflfyv'a r: & r rTY*& iT^iwiw 




ww»«« * uwti. i. vj\jv_ v. v_ x vaVjAS_A(jUAWUA(3GCAAXCjQG 

GTTGCTAGAGAGGTAATGGGGTTGAACAGGGTTTAAGCCATGAGGTCTCA 
AGAATCCGTGAAGACTCAGACTA A II 1 1 1 1 1 1 1 1 1 i ' T GCATGAGGATTAG 



- — AVJWliUAVJUAi 

oTGTTCu . AGGAATTTCAATGAGAGCAGGGTTAATGAAGGAATGCAGGGT 
AGGAGAGCTGAGGGAAGGCATCTGAGAGAGCCTGGCTTATGAATGGCTGC 
C-TCAGTATGGCTCACCTGCTTTCCTTGTATCTACTTAGCAGATGATCCCA 
:CCCAGGCCTCCAGGGCCAAGG7CATTTCCACATAGTCATGGGCCCTTGA 
~SGC ' — j GAG C AG7G7A**oG AAGACAGAG7C77AAGAAAT7GCA7TAAC .
--CA7GG7GC77GGCAAG7G7CG7CA7CC7A7GCCAAGCC7GATC7GAAG 
-.GG7GCATGC7CATAGG7AGC7GC7GCCCAAGAT7ACAGCAGC7TCT7CA 
I TC —AGA7CCATGC7C7CC7A7A77CATT7TrCCAGGGG77CC7G7CC7 
-rGACAGTGA7GAGA7GCAGAA7GAC77A77GAG77A77C7CC7GAXAG7 

-'"CCAAC r 7CCAAA7GACAA7GGGG.CA7GGAGC77GAGAG7GGAAA7G 

AGGC-TAGGGA7AGCG7GC77AGGAAAACAC7CCCAGCCTGATG7AA77 
—GGGGG7ACAA7GGCA7777CA7CA7CAAGAC7GA7G7AAAGGG7GAC7 
AGCAG7C-AG77GGGGGTGAC7CGCAC7GGGGC7AGG7T7C7GA77C7GCC 
TAA7C-AGACAGAGCAGAAGCAC7AG7GGGC7GG7AGAGGGCC7CCAGGG 
;^ T ^ AC ^^ XGXCCTGGWVAAAC AGC7CCAGA7TG77GG77CACG77C7 
3AGGACAAGC77GGG7AC7ACAGGATAGAGAGAG7GG7GGGAGA7GCCG7 
.„ cc ^^ c - CTGC7aATGCCX GCCC7GCCA77CC7GCG7G7GA7G7C7CXG 
GGGCA7C GCCTTCCCTGCCCAGACC7G7AG7TCAGC7GAGGGCA7G7T 



GAGGCCAAATGGCT7CT7AGAGTGTrAC7TTCC7TGAACAGCTC7GC7GG 
GAGAAC~GGAGGAGC7AGC7AGTCACGG7AAC7GCAGCAG7CAAAGGA7C 

. • »/^r^T»r!X/^aaiflifJAaf"J^TAT3VGCG77 




TCC7GG AGATGTU I i Vj l uviAunuuiw * . . 

r*"^ G AG C C C7C CAGGAAACCT7GGAA7 A77AGG7T AG7CA7C CCCAAGG 
AAG7C7AAGAA77C7GG7C7CACCCA7C7CC77TAAXTCCCACAA7GATC 
-7ACA7GA7A77AAGGAACACGGGCCAGTAACCCTGCAAGCAATGGA7G. 




-ACATAA7-GACACA7AAAAA77G7A7A7G7C7ACGG7G7ACAATG7GA7 

: 7T-^GA7C"A7GTATACA77G7GAAA7GA7TACAACAAGC7AAA7AACA 

~ AC C ~ A77 C AXCG7G77TCAAAGGAA77 AAAC7CAAGCACAAAAG AGAGvr 

: GC7G77GAAGAG7AGGGC7GC7C7ATC7AAG7AG7A7G7CTGG^. ; ^ 

CC7GGA7CAGGG7CCTT77G7GCTAG7AATAAACCAGCCC77C7GGGGC 

GC7CCAC7T7CCC C ACATT77C77CTGG AGC CT CC CTAAGAATT AGGACA 

TGGCCAC7TrCTC7GCA7AGGC77CCTAC7TCAACAAGGACAGGGCT 




\rAG~AAGAAA77AAACAG7GCAAAAGAA77CACTG7GAAAAG7AAAA7G 
~ x ^~~^c7GAGAGACAGA7ATTGCAGC C CAGA7AC7AC7GGGG7C 
AA7AG-~-C^7AAGCA7GCCAX777GA7GGT77A7GGGAC77ACAG^- 
~AAGAAGC77GACAC7AGGG77GA7CTCAGAAAATCAT7G77GCAGu7aa 




-7AGcack3agcccacttaaovtgtcctaack:cacctgga^tcc 



iGCAGCAGGTAGAC i 1 OhhUhu\uAw\wvnww^ i?^ZE^ 

GATT G7G77TrrAAGGAC7T7TAAC7GGGGAGCCCTCCGGGACAGATCA 

GA7GAGAG7GAAA7GTGC7CCGCCTTAGCC 

:S°^ii^^ TTC - GTAAAAG GGAGAG7GG7777A777A77Tr7AAAC 
^"kr~-AAGC-GC^AAAG7A7A7GA7A7G7A7AGA7AGAG7ATAAX7AAA 

"AC ~^AC7ACAGACAAAA7CAGGAGAATGG 

CAAA7GGG7AAXGGCAGCA77GGG77GCGCCCACCCACC^GAAGGCAGAC 
;^I-* TOTACTTTG GTAAATAGCACCCCAGGCTGTGTCACTGCCAA
GAACTTTCZACAGCTCTTGGAATAAAATTCAAAATAT^TCCAA^cara 

, TCTTT . C * T i TTTTT I ! 1 1 TTTTTTT'rTTrGAAACAGTTTTGCTr^rTT 

-^i^^iiS^^^ApCTTGTGCCTCAGCCTCCCAAGTAGCTGCAATT 
^^^:^rA»»^^^ CCCTOCTAAT " TTGTAT T^AGTAGAGAT 
o^jGw, 4 ^ACAATGTTGGCTAAACCGGTCTCAAACTCCTGACC^CAAGTG 
^r-^E A 5HS^S^E^^IGCTGGGATTACAGGCG7aAACCAT 
^^^ZZr^^^^J^^^^TCCAGACTCATTCCTGCCTTAA 
GG^ C. . ^.TTCAGTCCTCCCCTCAAATGACACCTCCATGAAGACGCA 
^—^SSTEJ^^^^^^^^A^^AGTCAATGTGTTGGTTTTCTGTC 
^ir: A t:^ GToTAAGCTCTAT ^ TC ^GAAACCrTGGCAGTCCAG 
T^S^^S^^^J^^^^^^^^^^^^T^^CACAGTA^ 

TTAAGTCTGACTATTATCAAGCTATAAGATGGTTGCAACACTATCACTAA. 
TTCCCTCATGCTCCTTGGTAGTCAG7CTCACCCCTAACGCCCCCCTCCTG 

gcaatcactgatccgttttttgtctttatagttttggtoS^ 




~ ZH. >- wriA ou^ i a i I IATTATTTCAGCCC CAAATACTGCC 

" G r2i wv " * ^ ctc ^ tg "cctagcattttccctatctccttagcc w 

CCTTwTAATCATAAACGAAGAGTGAACTTTCCCTTTCTAAAGGCAACTTA 

: ~" u ^ Av - AATrAGCT CCAGAAAGGTAAAGACTCACTGTGTGCTCATC 
?p;"I2:S^r: A S? G S CTGG ^^ CTG ^ G GTGCTGAAOUA^ 
AGAA^AG , GAATGAATCTCTCCCTCTCTAGACTCCTTCTCTTTTGTAAT 




TCAAGGTCTCT GCAC TCACrTCrCTrTCTGcfcaG^TTCTTm 

TPT^FE^igTT^^ ^^^ ^TCTCAGCTC^GACATTGCTGTTG 

AA ^~^^^^^^^^^^^^ A ^^ AAAG ^ G C^^CCAGCCCTACT 

S7 G ^ G ^? G ^ TAG ^ C ^^^ AA ^ A AGTTGTTTACTTACTGTCTCTTCAG 

GAGGGCAAGGAGCTTTCTGGTGGTGTTCAGGGCTGTACCAAGCTGTACCT 

rGC-CACCCTGCTACACTTTTTAGCAACCATCTAATTT^S^ 

~ GC I: r ; ^r TGGGAGGC TCACCCAC77CATGAGACTACATTTGGTCCTG 
v»G7AGAAAG * G * ACAAAATCCACTGGC7CAG7TTTAATCAA7GTA7G7TA 
A7ATTAACCAACCTGAGA7CTTGATT7CCACGCC7GGCTAATm 



1 * * J ~ UA ^ UA ^ i ^^^TGAGC77ACTGGCCAAAACTTGG 
•^ CT ^ GACT ^ CGGAAG ^ CTG 2TTCTTCCC7GGTAACTTTTC7GA 
AC . t AACCAC7GG777GC7TGACAAGAGA7TACCATCTTCTCACTTCC , A 
3C7A7G7GAAC7CAC77A7C7GC7C7A77GC7G77CAG7C7AGCACGGCA 
— TATTGAACGAGTGTCrACATCTGCACCCCCTACTTCTTACTCATCCAT

--7GTTTCAATT7CTTAAAAAGAAAAAAAAAAAGC7ATTGTAAACATACG 

ATTACAGAAAA7GATTrATAACATGTGTATG7ACCACC7AGCCC73TCAA 

37C-~AATAT77G7TATAT7TGC7TCAAATC7T77T7CAGACTGTAG7TA 

AAAATTACTTAGGAGCCATTATTTATGGCCTATTTCCTGACCTAGTCTTC 

T7GA7GG7CAA777GCC7AA7CA7C77AAG7TGCAAAAGC77AGAAT7AA 

AGCAAAGTACCTTCGATCCTCTGCTGTTGCCTTCTTTTTAATATTTGGGT 

TG777GGG7 C Z CA7T7ACGG77G7GAC ATCAG C773A G777TGGGAG»- . 

G7C~' ,, G77CAGAAAA7GG7TCTGGGGAACAGCC77777CAAC77GGAG7C 

-AAAGTC-TGTGCTTTTTGCTGAAAGCCATTATTGTTATGTTTATTACCAC 

-GG— CCATTTGGTCTTATGCTAGGGGTGCTTGGAATGGCTGAATTAAAT 

C""GCCAACTG7CAAATTAGGCCTCTGGCTTACGGCTTTTGACTTTTGCAG 

^ACACATGATGTCTGAGGTATACAAACTTGGCTGGACTTCTGATCTTGCT 

TGATG777GGATGTCTGTTGTTATAT7CACCC7GAAGCAAACTGGGGTAT 

GTTC r GGGTTTGGTGTGCTTCACTCTCTGTTCAGTAACAGGGTATGACCj 

TAT C~T AGT77C AT77GGT CTTTCATATTGACT C Cf ATTAACCTTTATAT 

CT77GATG7TC77GACTACTGGTTTC77TGA7GAC7GAACT77AC7AAGG 

GTCCGAATAAAGTGAGAGGGAACCGTCCTTGAGGGTTTTACTCCTGGTCT 

TGCAAGATCTGCTCCTCTAGAGAGTTGCTGTGATTTTACTGGGAAAGTCC 

TGCT"" 1 " G 7G777C7CCAACAAATTG77T A77AAC C C7 AT C777CAGAACA 
GCAC-A"-AACTGAACTrTTGCCCAAGGCTTGTTTAGGAACTAAACTGTT 
^GGTTTGATTATAAGAGTCAGTCTTTGGCTTACTTGTGGTATATAAT. 
-AGGA7C7GGC7TCC7C7CAGG77C7GTTAAGATATC7AGCAAGw^. 
-t^7~~G~~7C77T7AGAAAG77ATCCAAAGA77CG7T7TCAACATGGA.l 
^ATf'ATAAAGTCTATACATTTACCATTTCCTTGATCTGTTA^GC. 



GACAG7C CT C CTGTGGTGGACTATCTAGCTT CACACTvj i 1 w^^'-;; 
GCTGAAAAGCrrAGACTATGGGTTAGAAGAAACACATTTTGAAGTCCGCC 

TTTTGCCCAGAAGTTTTGGTGGCTCTAACTTCAGCTTCTGGGACCCTGCA 

GTATTAGGTGGTCTGGGCTGGAGTTTAATGCTGATGGACCTTTTAGGTTT 

GACAGGCAAAACAACATGGTTGGTAACATCATTTTTGGGTCTAATAGTCT 

Iaaa^caaagaaaatacatattaaaaaatccttaacatatctta^ 

Sttaaa^taataactgtgtttaac^catgctaaaaaaaa^ 
agaat~tcatctaagaaagttgaatcctcagaaagtaaagaaagactcac 

TAATAGGTAGTTTrTGTGTrTrTTTTTTTTrTTTTTTTTGAGACA 

TTGCTCTGTCACCCAGTCTGGTGTGCAGTGATGCAATCTTG^ 

AACC-CTGCCTCCTGGGrTGAAGCAATTCTCCCACCCCAACCTCGCAAG. 

GGCTGGACTACAGGCGCATGTCACTACACCTGGCTACTJTTJTuTAT^. 

-AGTAAAGT-GGGGTTTCACCATATTGGCCAGGTTGGTC'TTGAAATCC.o 

^CCTC^AGTGATCCACGCACCTTGGCCTCCCAAAGTGCTGGGATAACAGG 

TATGAGCCACCACACCTGTCCTAACAGGTAGTTTTTACAACTTGAGTT 

TATCAGAAGTATATTAGAATCTTTTAGCTTGACAGAATTAAGCAGAGATG 

Sgtgaatatacaw^cttgctctttcaaaaatg^ 

T AGTTGTTGAATGCCTATTATATCCTAAGTGCCCrCCAAAGAACCCTGAA 
AAAATACATACATAATGAACTTATGTTAGGGTACCTCCCAACAAATCTCT 
^CTAGTACTTTGTATAGCCACACTATATGTTTTTTAAACCACTGCCTTTG 
--.- A _.__: i:Z.i.ZZ.* .- nT . r - rr . TrT raTrrcTGGAGATCAG 



GGAATGCTAGAGTTACCACTCTATTTTTGATTTGACAAAT^ 
^T^CTACATAAACTTGGCGAGTATGTAAAGGATTrCTAACCAGAACCAT 

:~rT-AC-AGGAGTCa^CAGCGGAGT7GGGAG7GGGAGGCAGAGAGA7 

GGATGGTATAC^C^CA^TGGCTATATCTGGAT^AATC 
ATTTATATACACCTCGGATCTCTCCATCATTGCTTACTGAAGAGGxG«AG 
3GACG * - GGCATGAAAGu iTCCAAATGTGTTTTTTTAGTTGC'^^^A i

nG x^AG x GCACACTTw . GTGGATTTTAGTATAGCCACACTATTATACAGC 
^CH^GTCTAAT^ 

TTGGGTTTACTTGTTGGCAGTCCCTCCCCA 
>Contig38



AATGA 




'^G~^G^^IarTn^J ^^^^^^^^^^ ^^^^ 

— . w -~GCAAAATGGGGATTACACTACCT3CCTCACAGAGATGTAAAC 
TTATGACATGTTATriTrca -wr*r>n* ™;^Z2r^r^_- "7^^ 



CTGTGGCCAAAAACTAAATCTCATTTCTTTCACGTAAACTTCAGCAATTC 
AATAATAGTACAGTCATTTTATGTT7CAACTGAACCAAGTCAGGGTTCCA 
S;;^ G 5 C IH G H C ^ TCTGCTCT ^ G ^^ TC ^GAAGTGGAGGC^TC 



- AGC - * - - 'J«ATCTTCTTTAAACTCTCCTGGGGCCACAGGGGACTC"G7T 
TGTGTCTGTGCCCACATAGGATGATTCTGCCCAGACCTTTGCTGCCATTT 
E^5^J2Z^7 G ^J G '^ , ^^A G ^^^'^^G^C^3CAGTTTCCTTGGG 

TATCTGGGTTTTATCGTCCACCCATCCCAGAGCTCAGTCTAGAGGAGGGG 
GCAGCCTTCGGGTTCTCTCCTTCCTCCCAQAGrrr rTrrrrrr^s rv» r* 



T^Tl'Tt^ 1 7 ! xv - 1 " i A,J, * AAA ^^GTCTGGTTCTTGAATATAG 
AGTTGCAGGTTTGAGGGGTGTAGGCTGAGGTAAGGCAAACTATCACATGG 
AATAAAAATTACCCTGTGTCAAGGAACAACCAGAGCTGGACAGTTTTTAA 
ATGTGAAAACCAATTTTATTCAGGACTATGGCGAGAGGTGAAGTAAGACC 
TCAGTATAGAACTGGGCTCAATTCCGAATGCAGCATGGGCAAATGGGAAT 
GTATAGCCTAGGAGCAGGGTGGGAACCTGTGGATGAAGAATTACTAAAAG 
GGCATATCAGGGG7GAGGGGGCGTCCTGGCTACACCCACTAACTACTGTT 
3CTGAAGAAAGGCCTGGTGACATCACTGGGGAATGGTGGGGGATGAAGAA 
rrCAATCAGATGGATATTGAGGATAAGGGGATCTTGATAAACTGGC'TAG 
GAGGGTTTrTGCTAAAACTGGTTTTCATAGGTAAGTCCACAGACAGGTC™ 
rGGAGAAAGTTCAGGGACCTACGGTTTGTTCGGGCAGATGCTTTGTCATC 
TGTCACACTGGCACTGTCACCTGGCTTTCCTTTAGTCCCTCCCCCCCTTT 
TTTTTTTCTGGAGTAGTTTTGGGAGACCAGAGGAGCAGGGAGTTAGGGAG 




iiwiv.t^iiuii lA^iii^TTTAATTTATTTATCTGGGACAAGGTC 
TCACTCTGTTGCCCAGGCTGAAGTGCAGTGGtGTGATCTCGGCTCACTGC 
AGCCrCGACCTCAGGGCTCAAGCAGTTrrGCCACCTCAGCCTCCTGAGTA 
GCTCGGGCTCACAGGTGTGCACTACCATGCCCGGGTAATTTTTGGGTTTT 
1 1 1 i . 1 1"1T1 1111111111111 TTrrTGTAGAGATGAGGTTTCGCCATG 

rTGCCCAGGCTTGGTCTCGAACTCCTAAGTGATCCATCCACGTCGACCTC 
CCAAAGTGCTGaGiTTarar5<7/ , aTni«/'^»/-^«^~T>^/»/»<^/»^.>.*.f«« • . -.— . - 



— - - * i ••'«w««w*v*W(^*iWlvSTAAGCAAAGCAA 

AGCAAGGCrCCCAGGTGCTGAGAAACAATGGCTTrCTGGGGAAGCGTCTG 
TGTTCAGAACCTTAAGTTGGAAACATCTCTGAAGATGTTTGCCATGAAGG 
TTTTCTTCTGAAGTTGAGTCTTTCATCACTAGGTAGGCGTGTTTTGGAGT 
CTC7ATCAAACAGATCC7GTGTTTATTAGGAAGCTGTGGTTCATAAAGCC 
2CATGCTAATTTTGCAGGTAGCAGGG7GGCCCTGGCCTGACCCGGGGACA 
3AG7GGC737CC7CCC. _CAGGCAGGAAAC7C7C7CC7GCCACC7AG7;~.j
~ C CAT AC CCACATTTCAAGGGAGC77 CTGGGTGGTGAGT7TAC CAGACT 
A7GG7C73AGG7AGAG77AAGCAAAACAAAAC7AAAC7GCA7AAAGAAAC 
AGAAAGAAAATCAGG7G77A7AAAAACAA777GGCA777G777G7G777C 
AGC7CCG7G7CGA777A77GC77CCACAAA7AG7GCCGA7A7GCACCAGG 
'cAC7G77G7AAAAC7GAAAA7A7G77777GGA7G7GCCCAG7C7G7GAG7 
AT7AAAC3A7GG77GA777GAAA777GC7A7GA77CA7A777C7GGGGG7 
AA-GA7GCAGGA777C777GGGGGGCC7ACGA7G7GGCAT7C7AGAA77C7 
CAAAGAATCAACCC7GG7GGGACCAGGAAGAGC7GAGC7GAGGCC7C7C7 




G7C7CACv-.r»™™— — ~ — 

C"ATT*~CTGAGAAGTTAATGGTAACAATGGCATACATACCTACTTTACAG 

^TGAAATTGGAAACCACAGCATTAAGTGTTTCCAATGAAATTTGGCAATT 

TGGGAG777TC7GAGC7GCA77GGA7G7GG7T7TGCA7GC7G77AGGA7G 




C AC"" " C CACTCATGGAATTAGAAAii*- i 1 auaul :auai. a - - 

A.CC*^"~^CAAGGTGCTCCCGGGCAAATCACTTAAGATCCATCTrCTTCTC 

CTC CTGCTCCTTCTCCTCCTTCTGAGTl " l"l 'III T TTTCTTTCCAAAAT7C 
AAATGACACGGTACTGGTAGAAGAAAAGGTCCAAGTCTGCTTTTACAGCT 

C C C C ^ ATC C C C AAATGTACTC CGACC C CAAGATG ACCATGTT ATCATTT 
GAT-GACATCCTTCTAGTTTCAACTCATTTCTTTGCATGTATATGCACGT 
ACATATACAC7A7777A7777GCCAGGGGTCACCGTT7AGC7GCATTAAT 

— ' A7AAAA7AA7C7A7A777AC77A7GG777ACG7AAAACAACA7AC 

ACA7G~AAG7G7A7AGC77GA7AAG7C77CAC7G7AAACCAAAAA7AAAA 

~7CG AAGC CC CC C CAACCGT CTGAATGG AC C CCTCTT CTTGGC CAAGAGC 
ATTCCAAAGTTAACCTGAAAAAACTAGTTCAGGTCATGATGGAAGGGAAG 
oTTnr.&riTGCCCC^GTATACCCTTCTCCCTTTTGGAATTCAGGAAAAG^ 



^TT--CATCACTCCTCATCTACCCCCAAATTTCCTTATGCGTCTTTGCA 
37 CAACC7 C C CACCCCATC CCCAGGCAACTGCAGATCTACTTTTTGTC i>. 

-GCAC -AACTGACCCTTTCTG7GATTTCATATGAATGGAATCATGCG 

-7GAGCAG7C7777G7G7CTGGC77C7TTrGC7CAGCA7AA7G77777GA 
GG777G7CCA7G7TT7TGTGTTTG7CAATGGT7AA7TrC7C7CCA77GCA 
GAG7AG7777C7A7TG7A(^TG7G7ACCACAA777G7A7A7CCA77CCA7 
^C7GATGGACAT77GA777G777CCAGA7777GGCAAT7ATGAA7AGAG 
C7ACCA7GAACACCCAGGTACAAG7CTTTG7G7(^C77A7Grr77CA77 
~CTC77GGAA7GGAACTGTCATATCAATAAG7ATATGTITAAC777GTAA 
GAAAC7GACAACAAA7TATCTGCGA7GGT7KTGCCAT^^ 




'C777 AA77TGGACATiri JUUi i vjvj i u j. nuwn«Mw» » >. - — - - - - 

C* T 'AGT^"*^C777GCCC7GA7GACCAA7GG7G77GAACA7L i i 1 1CA7G7G 
C77TTrGACCATTTACATATCCTC7TT7GTGAAGTGTCTGTTCAAA7ATT 
^TTGCCCA777AAAACA7TrGGGGGTr7G7CTTATTATTGTG7TGGGAGA 
c5?ST^T5A^A™77GAGATGGAG7CrCAC7C7G77GCCCAGG 

CtIgStg5g7gS^^ 

^C^GCAA77C7CCTGCCTTAGCC7CCTGAGTAGCrC^T7A^G^ 
IZZ^GCCTTGTCATTCTAATTTGTGACACATAATT
^rr2:^J^£I^ G ---^CACTGGCCTGGGAAACCAAGGGA 
-^:2^S*"^?^ COT ^TTTTGCAATAArrGTG7G7TTr:CTGCA 
S" " * X^ C ^ CAAATT CATGGAAGCAAAACATGGAAGCAAAG7ACC 

:^^--^:-^J; A ^AAAAGCAATTTAAGAGTTTArrGAGAAAGTGA 
Zi^Z^^^^CATAGAGAGAAAArrCTACCACCCTCAGGTCC 

tccttgaagacattccatatattatgctagg^agtSc^^S2 

- GTAG x cttgctttgtcacccangctggattggagtgcagtggtatgatc 
tcagctcactgcaacctccacctcctgggttcaagcaattctcctgcctc 

>Contig39

C GCT C GCATC - -TCATATCCATGAGTGTTCT GTGGGCC CT GC GAAA 

^t A X-i:r^;r w ^ C ^ GCC ^ CTCCC ' :r:,!:c -CCTGG7GC7GCCTGT 
^ G :rZ A ^ TATG ^ T ^ GTGCGAC ^«^CATTTGAAATGCAACCAG 

I a ^-E gg 5S gcgg !Iwgcctaccagt™ 



^T:£r G ii TGAA ^^ 

TTTGCACATGTAGCTTTAATTATTCTCATTaTra tw» t> <*» « «ti 




rZr^Si-r^^^^^^CTSACAAATACATATGTGAGAAGT 
Ci"--""'* *^ lAC: I GAAATTTGG AC7TTrCTAGAAAATCTGAATGTGA 
• iGu '-' 1 ' - CACATACCTTTCTGGGGATGATGArT CTTGTACTTTTATT 7 '''' 




AJCEUrC** i0i ^ A ^^ e ^CTTTGGGAGGCCAAGGTGAGTGGATT 

GC i -^ G ^ ^ g ?^?JTF gaga ^^ g ^ g ^ ggggaaga '^ gg, ^ aaa ^^^a 

. wa w i ACwAAAAATACAAAAAAAAAAAAAAiiaraarrii a k » » 




AAAAAAAAAAAATCAGATTGCTTTATTGCTGGTTTTCTTTCTAA 

—^^^I^^^^^^^^^^^CCCATTGGTTAATGGTTCCTCCTTT 

GTC7ATT3AATAAAATACAGATG7CTGCTTTTGGCAACATGGTTGAATGT 




" u ' Jmuw * i «awiv-«jv»ua^ rAAACACACGTCAAAGGCAGTCTTCCTC 
TCTGAGGGAAAACACTTGTATAAGCATTGCAATaVATGGGCCTCTTTAAT 
TATGTG CCAGTGGCAAGAGCGGGTGC7GAAC C CAGGGGCCTGCCTCAATC 
CGGGGCZTTTGAGGCAGAATAAAGTGGTCTCAGGTTGTTGGCATTTCCTT 
jCCCTTCCACCCGAAGCAGACACAAATCCTCTCTGGAGGCAAGTTCCCCA 
A H^ GC ~^ GTACAACTCC ^ CAGACTAAGATCAATC ATGTACAAGCTCA 
rAGACAAAGGTCACCAAACACACAGAGCAATAAACAAATTCATGAGTGAC 
3TGAATGAGAATAAACA^.»AACAATAACCACCAGCTGGGATGCTCtAAG-

~~-CAGCTC-TTAGAATTCCTGAATATAGAATAAAACTGCCACAATGGCAA 

ACATGCATCTAGTACTTACTGTGTGCTGGGTTGTAAGAATTTTGCACATT 

GT G C C AG AT AC CG ACT CAGCTT CACACT CACC CT C C7 ACTGTGC C CT CTT 

.^A-- "T-GCACTAGATTAAAAGGTAGAAAGGAAGAGGCAGCTATTCTGTTCT 

-GGCTGTGCCTCTGGCAGCACATGCAAAATGGGCAGTAACAGTGGCAGTC 

icAGGTAAGTAGCCTTCTCACAGTGTGGAGTTAAAGGCATGGGACTGAGA 

"GAGCAAGGTTCCTAAAGGGACAGTGGCCAGTAGATGACCAGGGGCTACT 

3GAGTGGCTGCATGGCTCTGTGGAAGCTCAGAGGAGCCTTGGGTCCTGCA 

GGTGCAGTAGCAGCTTTCTGTAGTTCCTGATCTCTGGGTCCCACAATCTT 

"'"CCGT*" r TTGCTCCTCCACTTCTAATTTTGTAACTGACTTCCCTGTGTG 

~ ACTT """CTGATTGAAATAGC CAGACTGGTTTCTG7TTC CTGATAA 

GACATTGTCTGGTACGAACACAGTAACTCATTTAATCCGATATCTCTATG 

AAG^G^TACAATAATTATTCCTATTTTACAGATGAGGAAACACAGCAGA 

GAAATAAAGTCAATTGTCTAAGGTTGCACATTTAGTCAAGGGAAGGGTTG 

ATATAACATATAATTATTTAGAAAACATCTAAGGAAATAAAAGGCATAAT 

T-AAAAATAAAACTAGGCAGGTTTAAAAAAATGAAGTAATCTATAAGTAA 

AAAAGTATAATTGTTGAAATACATATCTTAGTGGATGGGTTAAATAGCTG 

AAGAAATGATTAATGAACTGGAAGGTAGTTCTGAGGAAATCAGAATTCAG 

CATAGATAGAAAAAATGGGAATTTACAAAAGTACACAGGAATTATAAAAG 

AGGT-AAATTATAGGGAGGGTAGAATGAGAATTAACATTGGTCTAACTGG 

lATTT-GGAAGAAGAGAATAGAGAGAATGAACAAGGCAATATTTAAAGAG 

"TGGCTGAGAATTTTTCAGAACCAACACAAACTATGACTTTACCAGTAGA 

GAAAACAATGTACACTGAGGAGGATAAATAAATATACTATGAACAAATTG 

'AATAA-AATACTCAACAAAGACAAAGAGAAGATGTTAAAATCAGCAAAA 

AAAGAAAGTCAGACTTAGAAAGAAATGACAATGGCAGACTACTCAACAAC 

AACAATGGAATCCAAATTCGGTCAAACAGTATTTTCTTCATGCTAGCATA 

TAGC 

GGGAGTCCGCTATGCTCCTAAAGATTTGCACCTwTGATC 

T^TCTCTTITATTGCTTrATCCTACTCAACrAATrTTTTrAGTGCCTGT 

•T^rrTTTTTTTTTTAATGTGTGTTGATGACTACAATTCTAAACTCATTCTA 

CTGATTCATGGGTGCTTrAAAATCTGAGCAGTCTTTCGCATTTACTGCCT 

rTGATGGCCCATCCCACCAGCTAAAGTGTGTGGCCACTGCTTACAGCACC 

ATGTGATAACGAGTAAC^^TGCCGCCCAC^CTCTTCTAGGAGCAG 

-CAGTAGGACCTTCCAGGGGTTGCAAGCAAACCACAGCAATATGTGGAG. 

3TGGCAGAGGATGGCCCCAAGAGGATGTGGCAGCGGCTAGTGCAGCTCAG 

-^GTCTGAGAGGAAATGCTGGAGAGGAGAGCCCAGTCTGT^GGCA. 

GACAGCCACAAGGACTTCAACAGCTAACATGGCTGAGTGGACTTTAT 



rCGTCC^CAAGCTCTTCTTCACTrTTGTTCTAGGGCCGCTGACATCCTCT 

SSgcc^^gtc^ccaggtcctcacttactcaactg 

TGCTGG^CCTGGTGGACCACATTCTCCTTCACCCACCTTTTGCTGCTCTC 
-r^CTC-CCACATGTTTCTCTCTTCTCACTGGCTACTCCTCTTTTGTCT 
"^STTAGCTcSmCCTCCTTCC^CCTCACTGTGCTGGTGTGCCC 
A^-AGT^^GCTACTCTCTCTTTTCC^GTGGCATTCATTAGATG 

GAATATGAATCTCAGGAAAGATTTGGATTrATTTTTAATTAATTTTTTTA 

AATTT^ATTTTAATAAATGAGGTCTCTCTCrGTCATCCAGGCTGGAGTGT 

AGT^T^GAGTGATGTGATTATAGCTCACTGCAGCCTTGAACCATGGGCTC 

2^i-^~c^GCCTOVGCTTCCrTGAGTAGCTGGGACTACAGGCATGT 

G^C^CC^ GCCTGGATGACTTTrTGTGTGTGTGTGTGTGTGTGGA 

GGTCTTGCTCTATTGCCCAGGCTGATCACAAACTCCTGGCCTCAAGTGA7 
- . - AAGATTCAGATTTTGTATTCAATTGACTGTT~GACAT ri-

TTGGACACCTAAGAGGTATCTCAAATATTAATTAAC^GGCCA^AATACA 
3AACTTT TGACCCCTGCCCCCACAATACTTGCC™a-^GAC^— 

^^;?pI^^=X^ G ^ ACT ^ CCCTC ^CCrATGAATGCC 
;--^I-^;:STn CCC ^CATCTCCTACG7TGACGCCATCAGCTAG7 
- £ -;---zTJ A ^CCAGAATATAATCCT=ACCACCTTCTC^CT^ 

ovj x *_AA7 ... i ^ACAGCAAGAGTGGCAIT 1 1 : 1 ' ■! ^^X^.,: 

-^---^^^^CAAGCTCCGCCTCCCGGGTTCACGCCATTCTCCT 
GCC . v.AG<_CTCCCGAGTAGCTGGGAATACAGGCGCCC^rcirr—r'"-r'- 



TCCTGCTCCTCTCTCCTCTCAGCCCCTCCACCCATTGTTCATGCTGCTTr 

^^^^^^'^^'^^^'GTGCTTGGCATGCTGTACCC'' 
AGA IEi:i^ GA "GGCAGCTTCTGTACATTCAGCCAC^ 

2^-;;XC GAGG ?i^? CCC - 3GC =ACCTCACCTGAAATAGCACCTCCG 
A Gw%s "*" CCGGTTATTCTCCATCCTGTTCTGTTGCTTGG7GATT' , "" 

AAGAT ~ ~ ~ ~ r^cr A TTrr^r ^^ G ^ GAGGAGGAAA '^ GAGG '^ GG ^^ 
A C GA I^^^° GAT TTC7AC7CCCGCC7GGAC7GCAGGCACAGCAC7G 



£^ExE^ G iI A ^ a< ^ CTCCT ^AAAGCTGGTAGCAGC7G7C3GAGG 

iG7G« i « x uT7ACC7GC7GCAGG7GCAGAGAAG7TGACTTCACAGCCCTr 

CAGAAAGAC7GCCTTCTTCCAGTTG7ATTTGTGTAC7TGC77GGG7G7GG 

GGAGGA77C7CAGCTTTCTCCACTCAAATTATCAGACCC7TrCCA7T7AG 

F^ AGAC ^ mCCCTOT CCAG<K:CAAGGGCACATAG7ACAGA 

AGGGAG77GTTACCCAGGGAGAGAACTTGGCTCTAAACC7GTAATAGAAA 

GG ~ AG ^ G ^^^^T?^ A ®^^^^' I ^TGATCT77GGC7CAGA7CCAGG 

AA77GGAACCAAGGCTTTTGAACA7T1^AA7GCAGGGGA77AAAAAAATG 

A7ACGAG7CA77CACGAATATATTTGCTTAACA7C7AAAGAGA7CCC7CA 

AAACAC7AGAAAAAA7AAGAACAAAAATCTAATAAAACAAAATTTGTTAA 

ACACA777ACCAAATTTTTTTTTTTGG7AAAAATTCAAA7G7CATAAATA 
AGG77G777C7AG77GA * 77CA77AAG7A777A7AAACAACGCATCAG*i

3A7G7 C 7A7AAA777C777ACGGAAGA7GGCAAG7AG7GGAA77G C7GAG 

~CAAAGAACA7G777AAAAAACCCAAAAAAAC7AGACGC7ACCAA7777C 

-r^~~ CAAAA7GGCC A7ACCCAC77AC CCATACAGAGA7GATT7GGAA7C7 

GGCTTCCTCACAAGGTGAGATGCCTTCACAGTTTCATTCTTCCTGGCATG 

" r C77CG-7777G7A7C7GAGAGAGC7GGCAGAA77G7G7CAC7AAA7CAA 

GGA7AGAGGG7CAAA7GACAGC7CAAGC7CACAGGCACC7C7GC777C77 

CCCAGACCACCTGCTTTCCTGCCACCAGCTCTGTTCCATCTTATAGAA7G 

G77GCCAC77GGG7G7C7GC7CCGACAGCCA7G7CA7CC777GCAC7GCA 

G77A7GAAGCAGACAGAGC7AGGAGAGGGGC777GCCAGCC7C7GCCC7A 

G C7~GGAGAA777CAAAGAAGGAGGG7A77GAGAG7GAGC7GC CGAAGAC 

-GGCAGC7CCC7CAAC7CAACAG77G7CC77CCACAAGAAG7CAGA7ACA 

T777777GGGA7AAAA7A777AAAAA77A77A7777A777C7GAA7AA7A 

TA777ACA7GA77CAAAAA7CAAAC7G7AGGCCAGGCA7GGC7GC77A7G 

~ C^G7AA7C C7AGCAA777AGGAGGCCGAGGCGGGAGGA7CAC77CAGC C 

CAGG AG77 CAAGACCAGC C7GGG7AACA7AG7GAGAC C C7G7A7 C7ACAA 

AAA777AAAAACAAAAA77AG77GGGCA7GG7GGC7GA7A7GG777GGC7 

C7G7 GACC CAAC7CAAAC C7CA7G77GAA7777AA7C C7 CAA7G77GAGG 

GAGGG7CC7GG7GGGAGG7GA77GGA7CA7GGGGG7GGG77C7CCC77GC 

-G7^C7CA7GA7AG7GAG7GAG77C7CACAAGACC7GG77A777GAAAG7 

G7G7AGCACC7CGCCC77CAC7C7C7CAC7C7CC7GC7CCGCCATAG7AA 

GA7G7G7G7G777CCCC777GCC77CCGCCA7GA77G7AAG777CC7GAA 

ZC C7 Z C CAGC7A7GC77CC7G7ACAGC C7G7AGAAC7G7GAA7CAG77AG 

ACCT- 77C77CA7AAA77ACCCAG7C7CAGG7CA777777A7AGCAG7 

G7GAGAG7GGA7GAA7A7AG7GCCA7A7G777G7A77CCCAGC7ACCCAG 

GAGGC7GAGG7AAGAGGA77GC77GAGCC7GGGAG777AAGGC7GCAG7G 

AGCCA7GAC7G7ACCAC7GC7C7CCAGCC7GGG7GACAGCGAGACC77G7 

C7CCAAAAAAAAAAAACCCAAAC7G7G7AAAA7G7G77CA7AAAAG7G7C 

77GC7CCCACACC7G7CCC7A7A7A7C77A77CC7CAGCC7CCGACAAC7 

ACTT7A77CA777C77A7G7A7C77CCAGAA7CAAAAAAAAAAAA7CAAA 

7ACAAGCACAG7GGAA7G7A77GCCC77CTrCCCC7CCC7T77G77ACA7 

CAGAG77AGCA7A7CA7AAA7ACGG7C7GCA7777C77C77777CAGC7A 

7CAGCA7G7777GGAGAGGA777CA7A7TCG7GCAGACAGCA7G7A77AG 

7CAG7 C C77GCA77GC7A7AAGGAAA7AC C7GAGAC7GCA7AA777A7AA 

AGAAAAGAGG777AA77GGC7CACAGC77CGCAGGC7G77CCACAGGAAG 

CA7GG CAGC A7C7GC77 C7GGGGAGGC C77AGGAAGC7777AC7CA7GCA 

GAAGACAAAGCGGGAG7GGA7G7C77A7A7GGCAGGAGCAGGAC7GAGAG 

\GAG AG AG AG AGAGAGAAAGGA7GC CACA7AC7777AAACAAC CAGA7C7 

" G 7 GGG AAC 7C7G7CACGAGAACAGCACCAAAGGGA7AG7GC7AAAC CA7 

-CA"AAGAAC7CCACCCCCA7GA7CCAATCACCCCACACCAGGCCCCACC 

-CCAACA7CGGGGA7TACAA7TTGACATGAGAT7TGGGC7GGGACACAGA 

ACCAAACAA7ACCAGAG7GCTrrC7CA7TCT7TTC7ATAGCTGCC7AG7A 

T7C7A7G7CC7TTACTrCATT7AGGCAGTCTCTTG7TGATAGACAC7TGG 

G77AC77CCAA777TTCCTATTACAAA7GA7G7GCAA7GAATAATT7TGA 

-CA7777CCA777CACA7GGG77A7GTCCATCTGTGGGA TAAAT C7CCAG 

GAG7GAAA77GCTGGATCAAAGGGGAAGTGCAC7TGTGATTTTCATAGTT 

AGCAAA7777G77C7A7AAGGG7CA7ATGAA7T7A7AG7CCCACGCG7AA 

^A777AACAG7GGGGA7TrCCCGACAG7T7GACCAACAAGG7C7G77G77 

AAAC7777GA7777TG7CAATC7GA7GGGAAAA7ACTAGTA7C7CAAAG7 

GCrrrrAATT7GACTTrCT7ATTACAA7G7TAAGCA7CATTTrAC7CTGC 

-CAAGA7CAAA7AG7AI IT TC I TT7C7G7GAACAGAC7G77AAGA7CCC7 

^GCC^C77GTTTTGCTGGATTTTTGTTCTTTTTTTTCAAATGTTTTGAGG 

CAG7^"C777ACA7G7GAAACAAG77ATCTCTrrATC7GGGGTGTGAG77A 

CAAC7AC7777CC7C7GGC7TGT777GCGCTTTGACT77GCT7C7GGTGA 

T7CCCGCAA77C7GAAAGTGTAC777T7GCA7CATTCATTCT7A7ACACC 

CA7GC7C77G77CACGC7GG77CC7C7ACC7GAGGGC77777C7777C7G 

c ^^ ATCT33< ^^ TCTTTrrGA GAGAGAG7C7CAC7C7C7CGCCCAG 



owAGCCCACCACCAAGCCCAGC7AA777G77GA777A777A777A7. . . 
x i. i uii ^^AATGTGGGAGCGAATGACAGAGGGTTTCTTAGA

^T GA ^S AG -^ CTG ^ 7( =AACAAAAGGGC7GGGGA77G7CCC7GAAGAA 
CCAGAGC7GAN77NCA7CAGGAG7AACANAGG7AGA7AG * V * AAGAA 

S^ G 7Z^ G JJ;fACGCW77C^77A7G7CCAA77 

.vj(_A(_«. . i . *-AATAGAACC7G77CCGGC7TTTC'TAGGAGGGGSr'"""" 

^77^ ^ 1 AGCA ^^ C ^ A CC7CCAGC7GTAGAG7GGGC77A7CATC~~~ 
CAACACGCAGGACAGG7ACAGATT r 1 rTTCTITGAGGCCCAAGGCCACAG 

GTATTT7G7CA77ACTT7C77C7CC77G7ACAAAGGACA7GGAGAACA^C 
jAAAGAAGGfMf;TrTT , r:'rrar!'T-rn/^^^» ^» /-.. 



: A SH?^^;^i:i^ GGAACAGAGA G7AGCCCA7GGGAAGAAAC7GGGA 
:' A ^^"^^1^ GCTGGGAAAG TGGC7CCAAAGAGAGACAC7CA777^G 

AT- -AG'CACAGCAGTGTCAATTGGAGGCCCTGGGATCAC^Cr-A 

CTACCC3AT7CCAAAGAAACAGGA7T7TC7TGGCC7GGCTGAGAGCAAAT 



ATA77G7GGAAGAATTGAC^TTGCACTATGCCCAAGAGACTTGGGTGGAC 
ATGG7CC7GGGAGTGC7TGAGCCGTCTAATTTCTCAGGGTCACACTCCTG 
7TAA£AAATGCACTC3GCCAGTGCAATCAAATG7GCCATrrCT 
AG777G7A7A7TCC7T7TTAA7ATTT7T777CAC7TG7G77GATCAT7TG 
CC77AAA77AACTT7C7ACTTTGTTTAAAACA7GGAGAATTAGCAAGCTG 
CCAGGAGGCCAGGCAGGGAAACCAGGATGTTTCCATTTACCTTGTTGCTC 
CATATCC7G7CCCTGGAGGTGGAGAGCTTTCAGTTCATATGGACCAGACA 
GAAGG7 ^^^ G ^GTGAGTCCCGGAGCGTGCAG7TCMT^TCG7 
ACAGvj7>jCA7CGTGCACATAAGCTTCGTTATCCCATGTGTCGAAGAAGAT 
A ^II5 TGAAATGTGGAG ^^ TG ^ 2 ' :t TAGG7ATAAAATCAGAAGGGC 




V 1 /// 
C-7GGGG7GGC7AAGAAC _C7GGACC7GACAC7A77AGACA7GGGt7CC. . J:
TACTGCTCACTGTGGCCGAGCAACAGAGC7AC7TAGGT 
AAAATGG7GA7GG7CATAACAC7AGCC CACAGGGAGG77ACGAAC CT CTG 
37GACAA7 GTAAGTGAAAGGC C C C7GAGAAAGAG7GAGGGAG7TG CAAAT 
G7CAGTAG C CA7CAAGATCT7C7T7AAGAA7AG777C CAC7AAAGAGA7G 
A77GC777GG777CCAGCCT7C7TTGT7TTG7C7CCCCGCTGGGCC77C7 
ACC777AAAGGGC7TTGGCTCTGGGGGAATTGAGTTGGCTGGGGCTTGAT 
GACT7 Z CAAGAGGACACAAG7GGAGATC7AC7GCC7GC7CTTGGCTAACT 
A C C 77 C 77 CAAAGATGAAGGGAAAGAAGGTGCT CAGGTCATT CT C CTGGA 
AGG7C7G7GGGCAGGGAACCAGCATCTTCCTCAGCTTGTCCATGGCCACA 
ACAACTGACGCGGCCTGCCTGAAGCCCTTGCTGTAGTGGTGGTCGGAGAT 
TCG7AGC7GGATGCCGCCATCCAGAGGGCAGAGGTCCAGGTCCTGGAAGG 
AGCACTGCGGAGAGAGCGAGGGAGGGAGCCTGGTGAGGTGGTCCTGCCAG 
GAACCATGCTTTGACATCAGAGAGTAGAAAGCTCAGAGAGGAGGAAAGGG 
CTTGAAAGAATCCCGAGCTTCTAAAGATCATCCCTCTCTGGGCCAGGCGT 
GGTGGCTCATGCCTGTAATCCCAGCAC7TTGGGAAGCCGAGGTGGATGAA 
TCATTTAGGTCAGGACTTCAAAACCAGCCTGGCCAACATGGCGAAACCCC 
T7C7CTAC7AAAAATACAAAAATTAGCTGGGTG7GGTGGGGTGCACCTGT 
AAT C CTAG CTATT CAGGAG ACTGAGGAAGGAGAATCGCTTGAACTCAGGA 
GGTGGAGGATGCAGTAAGCCAAGATTGTACCAC7GCACTCCAGCCTGGGC 
AACAGAGTGAGACTCTGTCTCATAAAACAAAACAAAACAAAACAAAACAA 
AATAAAATAAAATAAAATAAAAAGATTATCCGTC7CTGAAGCTCAAGGAG 
377AAGGG7G7ACTCAAGGGCACACAGCAGG77AGAGGCAGAC7CAAGAT 
7 AGAA7 G7 GGG C777CTGACAC C77ACAGGC7A77C7777AGAATAAA7 C 
C C A777 C7 AC777GT7CATC77TTT7GT ACA7GC C C CAC C7 ACACCATAC 
A7G7A7ACC77C7CTATATCT7TTTGTATCCC7AA7GCTGTCACACTA7G 
A777GC77777CATGQXGATGACCATAACAT777CCATTCAC C7AT GCTC 
ACTCAGCAAG7A7TCAATTTTTCTACACTGT7C7TT77TTTC CTTTT TCA 
TAACACTG7C7CATAGGCATTCTGCAAATCCTGTGAGAGTACTTTTTGTG 
AAATG77ACCACTTTCC7CTTATTCAGAGAAGC7CCGTATTAAGGCTTCA 
CTGAGG7TGCC7TAAGGCATGATAATGGTTCAAAGGCTTGAAAGACAGTT 
AAAGAGACC7GTAAG7GCACAAAAGAAAGTTGAGCAGGAGAGAATTTCCT 
GCCTGGAGCAGAGCCAAGCTGCTGGAAGAGGCAA7GGGGGCAAAGGCCAG 
G CAGACAAG C C AATGGGCTC CTC CCACAGC7G CAGC CAACAAGTTATGCC 
AG7 C77AAAAC 77 CT AAAGAAATA7G777T7 AACAAGATTGAGGACTGGA 
7TATGAGGCTAGGGGAGGCTATCACAAACTGGAA7AAAATAAAGCCAGAG 
AAAAG7GGC7GCC7TCCAACCTGCACAACTGACCTAGCTAGGCTGATGGC 
GGGC C AC CT AGGAAGGCTACTGAGCATCAT ATAAAACAGAAGGGACAGC 
AGGAA7A7AACA7GGCTCTTTGTAAGGATGAG7C7GAAAAATGACCATTT 
3CTGC C CAAA7GC C C7TAGCTACAACTGAAAATA7TTCAGAACTGGAGGT 
TGCAGGATGCTGGAATCTCAGAGATCATCCAGCTCAGCCCTTTAT'riTTC 
AGATGAGG7CCAAAGCGGGTAAAATGACTTGTCAAGGTCAAACAGCAAGT 
GAATGG7777C7TTCAAGTCTCAATTCATCTTTTTGTTTATATCATCTAT 
GTCTTG77G77A7AAGCTTCACCCCAGGTAGCAAAAAACTATTCTACTCA 
AAAGGGGTAGACATATGTTAGTTCTCAAGATCATCTCTTGGTTTCAGAGT 
77AAC7 CAAGTGATTGGCATAGGCTGAATCCATCTCTTAAAAGGATAATC 
AAATT^ATGTTGAAGACTTGGTTGTCTTCCTACTATGAAATGGGAAACAT 
7ATCAC7AC7CCTCCCCTGTCACCACCAAGTGTGGCCACCACCACCAACG 
77AG7GAG7GACTGTGGTGATATGATGACCAAG7GGCCAGGTCAGCAAGT 
GGTGCAGCC7GTGTCTCACTGGAAGAGGTTAAAGTCTTTCTAAAACAAAA 
7ACCATGGCATCAAAGTGGCCCAGAACTCCCTTCTTTGAGCTTTCCCTGT 
G7TAGAGC CC77CCTTGGGTTGGGAGTTAAACC CATAGTCTTACCTTCAT 
C7G777AGGGCCATCAGCTTCAAAGAACAAG7CA7CCTCATTGCCAC7G7 
AATAAAAACAGGGACATGTCTCAATTATGTCTTCTAAACAGGTTTATTTT 
T CCTTCCCTGTGTACAAGACTTGACTGTTCATAAGAAACTGCAAACAGCC 
^GCCTCTCAAAGCTGCCTGAAACACCTGGCAAGTTTCACAGTGATATGCG 
C AGAACAG7 C CAGAAGGCAGATTCTAGGCCTGGCAGGTGGGCACC CTGGG 
~GCTCC™G77GGATCTTGAGGCCTAACCTCTAGCCCAGCAGAGTCAGCr 
AAAA7C7GAGC7CTCCCTCTCCCTCCAAGCCACACTTTGCAAAGGGATTC 
C77G7A77C7GGGC77GGAA7CTTTTCTCCCCATTTGCC7C7GCAGGAAG 
TGGATAGCCTCCAGGTCCCAAGGCtGGAGC *

^g^^S^^^ A< 3GGGAAAAATCTGGTCTCCAAGCACAGA7 
AACn.^. .C.Ai . GAAAGAGGAGAC CTGC7CATG77AC7GG7C r CI nr ~ 




^Srf-I^— ^TG777^777AG7A7A7G7GGGACAAAG7GGAA 
^ACACACA^TTrATACATGCACATACTTTTCTrCATTCAC — TG~A 
CTTAAGrTTAGGAATCTTCCCACTTACAGATGGATAflATCGGTACAATGA 

GvjTGC - CTCTGCCTCGGGAGCTCTC7GTCAATTGCAGGAGCCTCTGA 
GGAGAAAATTGACCTTTCTTGGCTGGGGCAGAGAACATACTCTATCCAGG 

CCCTGCAAGACCAGGTGCTGTTACCCTAGrrJVATrTr ji tv?™ TV** * ^» ~ 



?CTCAC7CTGTCGCCCAGGCTGGAGTG<^GTGG 

2^^:E C3C ^ CCC! ^ TTCACAC CATTCTCCTGCCTCAGCCrCCCCA 

:^^-^ A S: A ^ OTCACCTGCCGCCAC GCCC3GCTAATTTrT7GTAT 

. . * iAui rtGAGACAGGGTTTCACCATGTTAGCCAGGATGG7C T 'CGATC' r C 

ITGACCTrGTGATCTGCCCGCCTCGGCCTCCCAAAGTGTTGGGATTACAG 

3CG7GAGCCACCGTGTCCAGCCTGTAACACTTCTTATAGCACTGAGTTGA 
:77GC7CCTCCTGCnTCOTCCAt^aarT rt a a i V i^^^i^ 




- - — — ~ * w i. wwww u i UKUti i WTAGAGTTTGCTGCTT 

G^7GGG7GM7GGGAATTTGACAGCATra 

CAGG7ATCG7GC7CGC7C7GCAT7A7CCAASGC3AG7GAGCC7^7GCAA 

GTATTTGAGACACGAGGGAAATAGGT7C7AC7G7GGGAAAAAGAGCAT77 

CATGGACTTGC7C7CCAAGCAGCCTTCTGATTTTTAATTTGGCTCCCAG7 

A7C77GA7A7CAGGAGTCAGTCACAAGAAC7CCA7CTTTAG7AAG7TATA 
T7TTeCAC&fy:a a a Trra a a a r«i -t< ■■ i » 




TCTGG7CCTGCAGA7AGCTAAAAGCCCTGTCTGGGTGGCCTAGGGACTCC 
7CTG7777GCCTCCACAGGATCCACT7TGCAAATTAACCAC7GG77CTCC 
CGT7G7AGGAAC7GCCACCTTCCTCAGAGCCTG7CTTTC7TCCTTCC7TC 

C7TCC77CC7C 1 1 1'Ll 11TT CTTT C7C7CTC7C7TTCTT 1 L1 1 1 1 ' dTTT 

C7TrC777C777C^Cl l lTCT77CT77CTrTCrrTCTT i rC'l l TTC i :'T i rCT 
7CC777C777C7CTITC7CTCTrTC7CTCTTTC7CTTTC7T7CT7TC7C^ 

c7ccc7ccc7ccc7CTC7C7crrrcTrrc7rrrrc7TTC77~^c^c777^ 

C777C7C7C777C777C7CCC7CCC7C7C7C7C777T7C777G7C7C7CC 
C7CCC77C7C7C7C7C777C7C777C7C7C7C7C7C7C7CC7AGACAGGA 



FIG. 3 (44 of 52)



WO 99/06426 PCT/US98/16102

ACCT^ATCCCCCi-.jGCTGGAGTGCAGTGGTACAATCATGGATrC^;- 

"GCATGATCACAGCAGCCTCAAACCCTTCCTCAGAGTCTTTATGCGGCAA 
■'"CAGCAGGGTCTGGAGGGTTGGTGGCTCTGTGAACTCTCCTGACAGAACA 
-AGAGA7G7C7T7GG7C7G77GA7G7GAT7ACAAGC7GAACGAAGGAAGA 
-CAAAGCCAGTGACAGGAAGGGAGATATGCAAGGGACCCGAGCATCAGCT 
C^GAGTTAGTCCATTCTGCTTCTGGGACTTGGGATACAGGTCAGAAACCT 
-GAGC-CTACTTCTCCATCTTCCAATTG7AGCATCCAGGACCTCAGAAT 



— GAGCTT- .Atl lL.LU4H.UV.L«fti hj**»»w»*.w>^»w— - — 

rfGCCAGCTAAGAGGAGCCCTAATGATTGTCTGGTGGGATATGGTGGGAC 
CACAGAGATGAAGACATGAATAGCTATTTGAATGTGAACAGCAGACGAAG 
. . n^.trTar^ifvsfTTryiaaf^Trj&CTCATCCAATAGCACAGTGTGGT 




TGAAGCASi-At.iiVjIAliwV_KUVjl hjuiiuh^v.^^* — - - -- 

AGGGAAATTTTGGAGCCATGGGGCAATGCCCCCTGACGTAACAGTCTCCA 
CAGTTCTGCCATGTCTCATCCTGGCCCTGTAACCTGGACCCAAATC7GCT 




GCACr-CCTTCCGATCCCCAATGAGCAGAAAATATCTCGCTATAAACATA 
GTTGGCACTAAGGGAGGGAGTGGAAGAGTGATGATGATGTAGATGGTGAT 
GTAGCCCCAAGGAAGTGGAACAAGCAGAGATGGGGAGCTGGA^TGCCAG 
GATGCTCCAGCTTTTGGGGAATTATTCAGCTCTTGAGTCACTAAAGCCTT 
'CTCAGC-GCAAGTTCCTCTTTACCCTGTCAGGTCATTCTTCCAAGACAG 




"AA7CATG5<jv- i I Uvii—ftvaUi-wj 1 1 ax wwvjv» i. * . — — 

J — CAA T CATT , *TGAACAAGAAGACAAGCAAAATAATCATGGG7TAGT' 
T7A7A77A77G7G7G7ACA7GCAG7GA7G7C7G77C7TrG7AG7GAGC7G 
^-t.^^-- C~" , G77CACCC7C77GC77AGAACAGAAC7AAGCAA7C7GCCC 
-CAACA7777CCCCAATTTCCCA7C7CA77C77GGCAC7GGC77CC7AA7 
A777G77C77A7GAGTCATrT7CT7G7A7CA7T7CCA7GAG7CCC7C7GG 




_____ i ~ a ^tti-tt a t rrrr a (TTT CTGGGAG ACCAA7 AAGAC C 




AT?GCAGGGCTCTCC7AC7C7A7G7CTT77C7CACACC7AG7GGG7A7AA 
3AGAGGGGACCACAAACAGAGGGGGC7C7GG7ACCAC77A7CCAGGG7C- 




jGG7G7GG\- 
TA7CA77TTT7 



-GGAC77G7CC7GCGGTCCCC7GTCT7A7A7CA77G7A77A 
TA7CA77TTT7CTTACATACAAATrrAGAAGCAA7AC77AA^^ 

rrcGTCC rAT7GAGCACC7AC7AAG7GCCAGGTACC7777777CCC7C 

A77A7C77A^7AACTCTT^TAA7AACCTTTAAAG7AGA^ 

CC7A7GCAGAAACTGAGGT7GAGACAA7AAA77A777AAGACC 

1 . > wmmi ir«r»rr;irTraAATATGGG77AACTGAAC 



CATTTGA 1 




T CCCC^7GTT7CCACA7TCTCC7TTTAC^T7GA(3A7^7^77 
CCCraTCTCCCTCCTTrCTTCTrCT^ 




^C7CA^C^A7AAGTGG^GC777GTrGCTGGAGGA777CCT77GTCCT7 

T7A77CT7C777AAGACTTTGTCA7AAC7G7CAAAAGCAA7CCCT7GAA 

G7A7C7G7CC77GGAAT7GTGTGC77ATGA7GC7C^AAAA7AC7C7 

C7AAAGC7A77A7AAATGC7 

GGC7AGC7GCAAC7CTrGAA7ACAAACACA77CAGACA7GC^ 

'""GGC'*" ""CAAAAAGAAAAAAAAAAA7 CAA777A7 AA7 AAT7C7GATC _ i 

77GC77A7TTCCACAAAC^^ 
- * C AATAT7CTCTTV . TCTCT CAT ATC CATTTT C C TTAC m rrr^r *
-ATT-AA* . . -GTGGAATGGAGGCTGCACTGATTTAAATTAAAAaV^a^ 

-";:-^^^TOTGTCTTCCATGCTGGAGTGCAGTGGCATG 
iE?i^5^i^ TG ^^CTCAACCTGCTGGGCT(^GTAATTr-r^^ 



^^^^^5^S?^I^ CGGACGTGAT CCGCCTGCCTCGGCCTCCCA 
^TAAATCTATATTACATGGGAGATCTTAATGTACrTr4^^^^I 



A I^ZZ?iS^^Hl^ A ^ CTGCT AATCTTCCAAACCTCATTCCTGTGACT 
^XX^^S^^ T ^ C ^ TGAAAATACTGACTGT ACACTTA 
I^^^^^Pf^ACCACCTGACCCAGATTCCTCATTGCGCTCC 



^^A^AeAGAAGTGGCAGGATGCCCAGCCCCACAATCAGAGGGGAAGGGG 

cagagccacatgtatgaac»tcctctccc^^ 

HIE£zZ AG iII^ GGaCCA AGGAAACAATGTGGGAAGCAAA^ 
Z;t; CTTTGCATGAAGACTGAGCAGTTTTACCAGATTCCCAGG 

AOU. i AGAAAAACTATTTCCTGAGCCACTCACCTTTAGCCCTafia a a r:-r 
GAAGTACTCCTAC7T~ .CACCGGCTrrCCTGTCATCTCCCTC^GCw^
C^CAGCCCCCTCTGCACAGGGAGGTTTCCTCCCTGCTGCTGCAGTGCTT 
~ 3T AC ~~GTT AGTGGT AC CTGGACACAGGTATTGGTGTC CTTGTCTCAC C 
i^r^ACATCACTGTAAGCTCCCCAGGAGCAGGCTTCCTGTTTGACTCAC 
^-^-AiTr-TCCACrrCCCACCCTGTAGTGCCTCAAGCATTGAGGACAAT 



kAGGC *TAAAAATGCA(j i 1 Avj i u^vj^j i uunvj\j«wvw * *^ 
cjGAT"*CCCAGGAATCT3GATCATTCTCTTGAGAGCTTTCCCTTGTCTCTC 
t-aaaACTCACATCCTACGGCCCAAATAACAACAAAAAATGGATGTAAAT 
-C-^GAAATAACTTGTGGATGGGGGAACAAGGCCCACCCCCCAGATCTGC 
CAGAAGCTTCAGGTGAGGGTCCCAAATGCCAAAAAGTCTGGTATCAGAGA 
i^,^^^ r -, r . T r.arrTr^^CACATGCCCTTTGCTGTGTCACTCAAGGA 



TTCTCAATTTTTTATA1 1 i Ili liu * * 1„ 

TAT^ATTTATTTATTTATTTTTAATTTT^ 

-rr^^AAGTTTTAGGGTACATGTGCACATTGTGCAGGTTAGTTACATAC 

GCATACA^GCGCCATGCTGGTGCGCTGCACCCACTAACT^ 

ATTAGGTATATCTCCCAGTGCTATCCCTCCCCCCTCCCCCCACCCCACAA 

CAGTCCCCAGAATGTGATGTTCCCCTTCCTGTGTCCATCTGATCTCATTG 

JVr TTAAAGGTGGAATCTCTCAGTGGGGTCTAATCTGTTCAGAAAT 

ATCAAAAGAGTATCCTTGGGAATGACTGGAATTCCAGAGTCATCTGGTAA 
'CC--ATAAAACAACTCCTGGATGTCTCTCAGCACATCTCCCACCTTGAA 
CGCAGGAGGCTGGTTCAAATGGAGGAGCATCGCrCTACTGCACTTTTTTT 
S^TGGCCTAAAGTGCAAAAO^TACGTTTCATGTAAATA^TCA 
ACTGCAAATCGCTAGTTATGCTGAGCCCTGTCCCGTGCTGTGGACACAAA 
^C^GGCrmCTCCCCGCCCAACAOVCACATAACACACA^CAA 

Ja^taaaaacatacatacccccaa^cataacaa^ca^ 

ACAAAATATATACACACAACACACACCAAACATGCCCACAAACCTGTGTC 
rirt&^TiGATCCTACTGGTGGGTTTGTGGTCTCGCTGACTTCAAGAATG 



^ACAAGiC . .-lA^^AAV-i^^; ~-rrrrr*r*rTCACATCCT 




^;;;i^ cc ^ CACAATC tcatcccatcccaagaggtttatcaaga 
aStctctaaagaStgtgtctccttatggaat^^ 
^GCACAGT^

; : ^^ atggaaaata gcaggaaatgagtgatcactgataac-' ; 

^ A ^: A P TACTA raGccAAAGGAATA^ 



CAGAG 



^ a ^^ tcagctctacc »cctgctcctgctcaccgtcotS 
, i;^, ? AACTGC 1 1 1 1 1 ^ t taggcaaagattaggctactgcagagact'- 

AGCAAACTTCTATAGAAGGTGTCAGATGGTAAGTATT^AGGCTTTrrTT 

ccgcJ gac " atctgtaaa ^^ g ^gtagtgttg(^taaatatag1S 




~ AAAGAAAAATAC CT AL j^ACTAATGGTTCACTTCTGAATAGCACAT,..,
"CATAATGATACAAGCACTCATTACTAG7CTAGGAAAATGAAGATATAAT 




^TCTGATTATTTTTTGCAACAAATTrTGTAAAAC CATTTGAC7GTTTAC 
ATGTAGAACTTGGATCTTTTTTAAAAAACACAAAATAATAATACTATTAT 
~TTTTAACT GGATTTTTGAAAAAGAAGAT AAAAGT CT C ATTTTAGT AATT 
AAAACT CATT C CAGGTT AGT C CACT CAAAACTT AT ATT CGAAAATTAAAA 
^-TTGGGAGGCTGAGGCAGGCAGATCACCTGAGGTTGGGAGTTCGAGACC 




GAGAATTGCTTGAACCCGGGAGGCAGAGGTTGCAGTGAGCCGAGATCACA 
rrATTGCACTCCAGCCTGGGCAACAAGAGTGAAACTCCATCTCAAAAAAA 
U ^V: ....Aits -nr.* » * * rrT"rmi ArtTTn&GTTTGCAGATATTCAT 



TATGCTCATTTTTAACTTGTATGTTTGGAAAAr^lUAi^i^w^^ 
OT^GGGGGATGAGAAAAAAAGAAAAACATCAACCCCACAGCCCATTCAA 
TTTTCAGCCCGACCCACAGCTCCGGGGAAGGGCAGCAGGTCCATCCTTCA 
^CTTTC r TCACCTCTTTCCCCTCCTTCTGGCTCTTCCACCTCTAAGTTG 



rAGCC'-AAGAAGAGGCACTGGGAAATGGAAAAGTCTTTTGTACGXGGTAC 
CTGCCGOGGAAGCTGCCATGAAGACCTCMCCCCACGGTGGGGAGGGAATG 
-CCAGCTGAGGCCTCGTGCCCATGCTAGGATAGACTCGTCCAGACATGTC 

Xggtggtctgacagggcaagcagcaggaagtcatgtatgagtatgaactg 

ATC^GTATGCAAGGGCGGGGAGAACACGCGGAGGAATGGGGCGTGAGAAA 

ACAGCACAGTACGTTTCTTTAGCAGCTGTCTCTGCTCAGCCATGGGAGTC 

AC^GAGAAAGAGGCTTGGAGGCGTTATTTTCACTGTGAGATGTG^ 

AAAAAAGTGCCCAAGACACAGTGAGTACCAGGGAGATGCCCTCTTTCCCT 

ACCCG^TGCAGAATGGCCACAGGCCTTAAAACACACACATGGCTCCTCA 

GSGAkoSCCTCCJ^TGGACACCCGCATTCTCCCCTCKT^G^ 

^GCAGGGCGAGTGCTGGGCCATCATGAAGCrTCACAGGCAATGAGCTCT 

rAGCAATAACAGGAACAGTGCCTGGGGGACTGTAGCTGCAAGACCGATTT 

TCATGTAAGATGGCCTCTGAGGACTCCGAGATACACCAGGCTGAGACTAG 

CTGGCAGCTCCAAGTTCTTGGTCAGAAGAGAACAGGAACTAGGGAJ^TTC 

^.^^^^r.^ a rTir&ATTrCTTTACATCCGCACAACCATGAGGTCCAG 




T^GGTAGAGACTGGGTCTCTCTGTATTGCCCAGGCTAGTCTCGAACTCCT 
GCATGAGACACTGCACCCAGCCAGTATAGTCTTTTAACAGCTTTATTGAG 

gwcggctaacattgajuu^^ 

SmSctAATGTACACACCAGTGAAACTATCACWCACT 
-^CATATCCATCACTCCCAATTTCCTCACGCCCCTTGGTAACCCCTCT 
CTCCCAACTCCCTGCCCCCTAACATCAGACAACTACTGATGCACTCTGTC 
ScA^CTCATTrACATTTTCTAGAATTrrACATAAATAAA^ 

IgStaSctccttcatgtatggcttct^ 

TCATGC^ATGGCTGTGCGTATCCTTAGCCCATCTCTTTGTCTTCCTGAG 

taStacc^ttccatac»c^cca»cotgctcatcca^ 

R ^r.r^r^r.TTTTTTGCAATGACAAATAAGGTTGCTAT 




rTAACTTTTTAAGAAACTQTUHJVAv. nw»N-u««rtww«* 
^^XilJ^I^JAGAAAACATCAATCTGTCTGATTCCAAAGCCAA
^C-^--^S^r^^ CT ^ TCCCTACAGAT GCAGAG 
^-^^;^^^GAATTCTGTTCCCCTACCGAATAGAGTTACA 
^cI-^r^?I?^ C ^ CTOC ^ CTCT ^^CCACAG 
. . TCA« i aACCCACAGTCAACTGCAATCCAAAAATATTAATGAAAAATTP 
CAAAAATAAACAATTCAGAAGTTTTA^ 

;^^T^^^^?ACATCCAACCATTGACATCATCATGACTCTAT 
-ATCwAGGATCACCGAAGCAGATGACCCTrCTTCTrtara ti TriTr, W r 



uuliA1 iAft ^CACTGCATCACTATGCCCACATCAGTCACC^CACT 

^^I^*St^ TACmGGG AGGCCAAGGCAGGAGGATCCCTTGGGCC 
CAGG CATT CAAAAC CAGC CTGGGAAACATAGTG Ac; A rrrrrrr-v r-vr-Trln 




™-™«v, i ~«wu««i 1 1 Ai t-t-AGATACAGTGGTGCATGCCTGTGGTC 
.^AGC^CTCAGGAGGCTAAAGTGGGAGGATCACTTGGTCCCAGGAGGTC 
jAGvjwAGCAGTAAGCTGTGATCGTGCCACTGCACTCCAGCCTGGGCAATA 

^IZ£^^^^^^ : ^ a ^ a ^ a ^ a ^ a ^^^ a ^^ttgttctat^catt 

ACTTATTGTTGTTAATTTCrrTCTTTGCCTAATTTTI 1 1 1 1 Ti St TTG 
A °J^ A 9IEH^ CTOTG " GCC ^ TCCTGTAG TGCAATGAGACGATCT 
C^TCACCGCAJ^TCCCGCCTCCCCSGGTTCAAGTGATTCTCCTGCCTCA 

5EF cc SS^ AGCTC ^" A c^^ 

cgaactcctgacctcaggtgaggcctcagcctcctaSgtcctggSSa 

^^^^^^^^CCT^GC^^TAAATTAAAC 
ATTGTCACAGGCATGTATT A A ttt a T»nr.! » » » >rv»» m« o n . _ . 



GATCAGATTCACAGAGGACATCAAAATATTGGCTCCCCAGAAGGGAGAAC 

^S A ^^EF CCTCTO ^ T ^ C ^TCTTCACAGGAATGCAGTAGAAG 
GGACTTAACTCCTGCCCTGAA^SAAAAGGTTAGGCTAGGQAAACAGCTCCA 
A AAj i; j. i iA AAAGGAAGCAACATAGGCATCTACTGGGAGTTTTCTAAAr. 

i AATAGAGCCTTGCTTTAAGACAATGCAAGTGGATGGTAATGAAGGAATG 
AGTCTTAGGCCTTGGATCAACCCTATTAAGCAATGCTGAGCATGGAGCCA 
ATTCTGTTCACTAGA.TTTGCTCAGAAAGGGCCAGACGAGAAGGATTTTTC 

GA ?i£T A I GCTAA ^ TAAA ^^ 

AAGTAATTAATAGAAAATTACTGAAAGCTTTTTTGAAACCIAAAAGTAGTC 
AGCATTGGTAAAarrTY-rara a mcT-mr-n ~« m. _ 
CAAAAGTTTGAGACCAGc vJTGGGGTAACACAGTAAGACCT CGTCTCTATu '
AAAAATAAGAAAAGTTAGCTGGGCATGGTGGCATGTGCCTGTGGTCCCAA 
-rr* • n ~~~r* * <~ i r: a rrrz^ nnr a. nn a rv^ATCGCTTGAGC CTCGGAGGTCAAGG 




GAGACCCTGTCTCACOIGAAAAACAAGGCAAGAGAGAGAGAGAGAGAGAA 
GGAGAGAAAGAAAAGAAAGAAAGAAAGAAAGAAAGATGGAAGGAAGGAAA 
GAGAAGAAAGAAAGAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGA 
AAGAAAGAAAGAAAGAGAAAGAAAGAGAAAGAAAGGGAGAGAAAGAAGGA 
AGGAAGGAAAGAAAGAAAGCAAGCAAGCAGGAAAGGAAGGAAGGAAGGAA 
^sr.r.a&r.^ar.aa^aa a arc* AAGAAAGAAAGAAAGAAAGAAAGAAA 




AGAAAAGGACAAAGAAAAGACCTTTGAACCCTGAATTTCACTTTTAGAGA 
"~rCATCTTAAGGAAATTCATTCCAATAGAAATTTATCCCCAGGATTATCT 
AAATATTTGC7TTT A ' l"l"l"rCl r CTAGTAATTTTATGGTTTAACTTTCTCA 
^GTrTAAGCCTrTAATTTATTTGGAATTTATTTTGGTATGAGAAAGTGTG 
ACC^^TTTTGTTrTACTTrAAAAAAAATGTATTACGATTATTATTTTAG 
AGACAGGG7CTTGCTCTGTCACCCAGGCTAGAGTGCAGTGGTGTGATCAT 
AGCTCACTGCAGCCTTGAACTCCTGGCCTCAAGCAATTCTCCCTCTTCAA 
CTTAGGAGTAGCTGGGACCACAGGCATGTACCACCATGCCCAACTAATTT 
TTTT^ATTTTTTGTAGAGACAGAGTCTTGCTTGTTGCCCAGTCTTGCAAT 
GTTGTCTCAAACTCCTGGGCTCAAGTGATCCTGTCGCCCCAGCCTeCCAA 
AGCACTGGGATTACACGTGTGAGCCACTGCGCCCAGCTGCCTTTTTATTT 
-^AATTTTrCAGATGCTTTGTTGGTTCCAAAATAGCACTTATTAACCCA 
*2 CT ~— . — CCTCTGGTTTTAAATACTGCAAGTTTGGCTTTGAAATACAA 
C CCACTGCCTTATTCAGGCTACATTCAAGGAAATCTGAGACCAAGAGTCT 
GAAGGCCCAGTTTCCTTCCTCAAACCCAGGAGGTGGTAAATGTGTCACTT 
C CACAC"TTCTATCTATTTCTAAGAACTCCTTCTTTCCAAACTCTGACAT 
GCCCCTGGCTCAGGTCTATAGAAATTCCCAGGGTCCACAGACAAAGCAGA 
ACTCACTTATGGGGAAATCTGGGAAATACTTATCTGTTAAACCTGCCCCA 
TATGGTGACTCAGATTGTCTAAAGCCCAAAGCATCATTTTCCACCCCAAA 
CCATTTCCTCCTCCAGACTTCTCTATTTCTGTGGTCCAGAGTCAAGATCT 
TGATATTACCCTAGAGTCCCCCTTCTGCTCTCCTGCATACCCAGATGCCC 
CTCCCTCCCCAGATCCATTCTCCCACCCTCCCTCCCATCAGTTTGGTGGG 
CCCATCACCGCTTCCCCTGGCCCAGGCrCTCCTTTTGTGCGCTTGGAGCA 
^^r-* r-r~* TrTrrrar^rTTCACTCACTTCATGTGGTAATCTGTTGTGT 



GCAGACTC 

TCATCACTGTCAGAATCTTCTGCATC ttt.-i'(-A<- i ai- i v. i a «««««w»w 
"CTAGTGGTTCCTCATTGCTCATTAATGAAAGTCTAGATATTAAACGTAG 
AAGGC CCAGCACAATTTGCCCCTATGCCACCTACCTCTCTAATCTTTTCT 
^^TACTCTGACAGACTCTCCGTCTGTCATTTATGTATTCTTTTATTGCT 
-TCr-CTACTTTrAGTATGAACTGGATTTATGGATTTTTTTAACATOCT 
"TCAAGTATGGAATAAAGAATTTTATTTATTTATTTATrTAT^ATTTGA 




3TAACCTCGAATTCCTAGGCTCAAGCCATCfruv.ivav.v- * 
CTC CTAAGTAGCTATGACTACGGGTGTGCATCACCACATCTGGCTAATGG 
AATAAAATATTACAATGCCTAATCTTAATTCTCAAAATTTTW 
TGTACCTAATGCCCATGOlTTTACTTTTTTv^GTGGGTCAATAGCCCrCA 

CTTTGGCAAAGGTCCCAGGCCCAAGGTAAGGCCTTACTTTT^ 

ATC-^TrTGAAAGAv^TAAGTGCCTGTAAGTTGTACCAv^TTAGGTTCTAG 

GAATTT^CATCAAAGACTTTATCAGACrATTTTCCTCTAAGTTGAGAAA 

GAGCTGGGGGv^GAATATGGCACrGAATGAvTrGAAGAGAAGGCACTGAIA 

^^*/^/^^'-fcrartftTTf^*TGcavAAGAGCAATGAGGAACACCAGCAGCAATGA 



oS— aS^Itag^gXctc^ 
a^tatoSSS^^ 

CTC^GAAGATACATCTCTCAAG^v^TCCCCGCGGCTC^ 
ACTGCATTCACCTTC7C _ACATTTGCCCTCCTTTGGATGTATATAGA
GTTrTAAAATACAAATCTGATGTGCTTGCTCTCCTGOTGAAACACCTCA 




TTTTT^AA^^ 

attactcctttcttaatgaccacctgctgtaSS^^S^ 




GAAGG 
AAAAGAGACC 



rtM^^^uu^ui^AUQecTCCCTTGTCCCCTCCACCGTGTAAGGATGCA 
ACAAGAAAGTATGGTCTATGATCOUUUUK^^ 

AATATGCTGGCACTTGAACTTCCCAGCCTCCAGAACTGTGAGAAATAAAT 

TTCTATTTTTCATAAGCCACCGAGTCTATGGTATTTTGTTATAGGAGCAC 
AAACAGACTGATGTGCC4PrraarrBTr!»'r-r!i<rii ~-~~m« » 



^^^ l ^^ ( -A«iCACAGTGGCCCATGCCTGTAATCCCAGCACTTT 
GGGAGGCCAAGGTGGGTGGATCACCTGAAGTCAGGAGTTCAAGACCAACC 
TGATCAACATGGTGAAACCCCATCTCTACTAAAAATACAAAATTAGCTGG 
GCATGGTGGTGCACACCTGTAATCCCAGCTACTTGGGAGGCTGAGGCAGG 
AGAATCATTTGAACCTGGGAGGTGGAGGTTGCAGTGAGCAGAGATCGTGC 
CACTGCACTCCAGCCTGGGTGACAGGGAGACTCCGTCTCAAAAAAAAAAA 
A^£AAAAAAAAAACCAAAAAAAAAACAAAAAACAAGAATTACTATCCCAG 
TTTTGCAGATGAGGCAATGGAAGCTCTAAAAAGTTAAGTAGGAGAAACAA 
ACATGAAATGTATGTCTTATGCTTTTCCTCATCCTATTTCCTCAGCCTGG 



1 1 >^«wUAAUTyeTGTTCTTTGCCACCTCATTCCCCAGGGCTTCTAACA 

CAGTGCCTCATGCATGACAGAGTTGTAAAACAGGTTACCAAGCTGGCTTC 

AGGCAGGTTTGCATGGAACTGTGCTTTACAGGAATACCTGCTCCCCCCAG 

GCCCTGGGTCTTCCTCCTGAGTCCAGGCTCAGACTCTCTCATCCTGCTCG 

TTCTCTCTTGGGGAGCCACAGTAACTTTGAGCAACTTTGCATGGGATAGA 

ATGGCCTATTAGGGGCAGCACAAAGACCCCATGGAGGGAAGAGTACAGAA 
AGGGAAAACGATAATCATATTTTTTTAAf^TGTfVaTTTrr-i-ra a a a 



AGGGAAAACGATAATCATATTTTTTTAAGATGTGCATTTTCTTAACAAAA 

TGCTCTAGTACrTGTCCAGACTTTCAAACTCAAAAACCTAAGCGTCCTTT 

TCTTGAAGATCATCAAAGGCCCCAGTGGTCCTTCAGGTATGTCAAGCTTT 

CTAGAAAATAAAGGTAAGTCATAATCACTTAACACACATGGCTAAATGGC 
CATTTCCTTCTAATTTATCAGCAACTGTTACATATTTrTaT&rTsrsi ti> 
>Contig1



SCACCCATGTTTCTAAAGGGCATACCAGCCATAATAACAGGATGGGTGAG 




TAAAGGCATCAGGTTTATGQAATGAeTAl^jWaAUAA 

ATACACACATGTATATGTGTGTATATGTATACATGTTTATGTATATATAT 

AATTATATGGTACCATTTCTAATTGACAAAATAATCTATCACATTTTACA 

7TATCAGATTTTACATCTATTGTTCTAAATACACTCAGTCATCAGCCCTG 

TGTGTGGGCTCTTACCCATCCCCATGCACACC7CAGCTCAACCACTGATG 

GATGGATCATCTGCCTATCAGAGGTGGCATATTCAGGTGAATCCATGGCC 

ACAGC-GCAGCACTTCCTACCCACGCAGAAAGGCTCCACAAGAGGAGGCA 

CACCGCTCTGACTGTCCCTAAGCTCCTGACATCTTCACCCCATGAAACT 

GCTGCTCCTGGGTGCTTCCTGCCTTGCCCTGCCCACCCTTGTACTGTTCT 

CACCATTGACACAGCTGGTGCCCGATGCAC 

NAAAACGAATCGTCACTATTGAAGCCTGTCTCTCANCGGATCGTGACTAA 




TACATCTTAAATATAriUAi rt»Ai.v*AV. l i av_u iv. j. ^ - ■ 
AACCAAGCTGTGCTCTTACCAACTTGGGCACATGTGGTCAAGACCTCCTG 
ATGCT^^GTCATCAGTGGGTGGGTGTTCrCAACCTTGGAAAAATAAACT 
"Tr^AAAT^AACTGAGACCTGGGTCAGATTTTTGGGGTTCACAGCAACAA 




ATGCCACACTTCCAACATGTGTCCCCATCCACCATCTGTCTTCTTATTGC 
TGCATCC-ACCCAGGCCCTGATCTCTGGACCCATTGTTGTATAATTAAGA 
^TT-GGGGCTGGGC^TCGTGGCTGTGGCTCACTCCTGTGATCTCAACATT 
TTGGGAAGGTGTATTAGTCAGGATTCCTCCGAAGGATGCAACCCTAGGGA 

TCCTCTCTATGACCCTATGTCTA 

CGCGCT C AAC CGACCGATTTGCGCGAAC CTGCCCATGC CC G AGGACAGTG 
TAATCCTAAAACGTCCCCtGAATCATAAGGATATGAGTGCGAAAGTACGG 
TTCCCTCTGTCACCACrTTCTAACAACGCTATGTCCGATCCGTGCACTAA 

ccccgcccaagtcactgaaacactgatgggcgcttcctctacaggtatcc 

» -» r^ai^r^rrrrcTrrrrGTCCCCCTTCCACTCTCTAG 



GTCATCCCAACCCCCGTCCCATCCTCCATAi il,ci-i-i-<-v-<-v. 
GGCC--ATCATCATGACCTCCAAAATTCTGTCATCTCCCGACGTAATGCC 
3CC GAACGCCTGACACCATCAAGTCNGTCACCTCCCAAAATACTCC 

-cctaatcacSggc^^ 
gggccgatatcacacac 

NGG^GTTTAGGTCAACTAGTAACAAGTGGGATTTGCGACTCAGG 
TAATC^CAAACCCACGTCCTGGACCCCTAO^CAGACTGCCCTCCCTCAG 

TPCAAAGAATCTATGGACCCAGT^TCTCTGGAGTCaATGTTCTGAGGCT 

Sg^S£caggc^^ 

.^^^^^^x^Rr^ir^j^ar.aTrAT&AACAGAATTCTGCCCTG 
CTACAGGTACGCACCACTACACCCTGCTAATTTTT^GTTT^TGTATTTT

TTGTACAGATGGGGTTTCTTCATGTTACCCANGCTGGTC^TG^C C^TS 
SGCTCAAGCAATCTGCCTACTTCAGCCTCC^^SS^i^IS 




>Contig5




AACGTGAGATATNTTC 
>Contig7



a ^I^-:a-*^ otcctatc ^gctgacatcaaagggataagttcc 

AGTT AC C "AGCTGAAGGGAAGGAGGGTGTTTCAGATAGAGGAAGGATAAG 
CATGACCTATTCAAGGCcar.Tr.a* ^ ^i—I?: ;=££?v52**C*? 




^II^5P^^^^ ATA TAATATTCATTCAAGCATGCAGTAAAA 
GGTCACTGG<^CCTGCCATGGGCCAGGACTCGG<3CTCTAGVTG^TTGCGT 

I^^^^SZZ^^^^ ^^^^^^A^^AGTAAGCTGTTTTTGTG 



GGGG 
>Contig8



^I^IH-l- TTO ^ CGCTGTGTA ^' lmAG GTGACGGGTGGTGACAA 
TGAGTCCGTGTCGAGCGCTGATTTTTTCGGCCTTTAGAGCGAGATTTATA 
CflATAGAAlTTGGCATGAGATTGGATTGCTTTTAGTCAGCCTCTTATAGC 
CTA^GTCTTTGAGTCSACTAGATGACATATCATGTAAGTTGCTGATAGGT 
rj^ G A^ TCCGCTCCTAGGTCTG ^ATTGTACTTScC^IcTCG 
ACTTAACCAGTACCAACCCAGCTTCTCAACGGATTTATACCATGGCACTT 
TAAAGCCAGCATCACTGACAATGAGCGGTGTGGTGTTACTCGGTAGAATG 
CTCGCAAGGTCGGCTAAAATTGGTCATCAGOT 

uAAAACGGGAACGCTTTCTCATAAAGAGTAACAGAACGACCGTGTAGTGC 
GAATGAAGCTCGCCATACCATAAGTCGTTTTTGCTCCCGAATATCAGACC 

agtcaac^gtgtcaatckmctcgtattgcccgaacagattaagctagS 

TGCCAACGGGATAAACGAGTCGCTCTrGGTGGAGGG 
>Contig9 

GGGGTGGGGCGCCTGGTGTTTCTAAAGAGGATCTCCTGCCAGAAATGGTG 




ATCCTAGC^.^TGTTCCACCCAAAGCTGGCCAGTCTCCAGGCCCTGCCTG 
AGCCCCAAGGAAGTGGTATGGTGAAACAGAAGGGCCATTCCTGTCCAATG

T GTGAGGAACTTCATTT CAGACTTGTTGGAAGCC CTGATGTTCAAAAACC 

7 CAATG ATAT CATT CATTTTC C C CAT CCATTCAATGC C CATCCAATGCC C 

ATCCGTTCAATGCCCCTTCCATTCCTCTTCAGGGAAATGAAAATTGTTCA 

GAAATCCTTTCTCTTTCGAGAAACCAACCAAACCAAAACCGCGAAATTCA 

CTAAACTAGCCAAGACACAATCCTGGGTTATTTTCCTTTTCCCAAACCTC 

CTCTGTTTAAATTAATTCTACCCTGGTTCTCGGCCCTTACTGCGAAGGTG 

AACTCACCTAACCTCTCCCAAACAGAGAAGAAACTTCTCTTGGTAAAATG 

GGTT7TAACACTTCTAAAAAACCCCC 

>Contig10

GCTATGGTTCTAAAGGTAATGGACTATGGCGTACACAACGTCTCGCTCAT 

CGTCTG C CAGGAGGCTAAGGTAT CCACGGACAATCGCTG AGCAACAGTGT 

C GTTGAT CCAT CTCTGTACGCACTTGTCAACATGGCAGGAGTACGGGAGC 

TGCGAGAATCCTCTCTGCTGATGTCCCACGGAGCATGCCGTGAGACAACG 

C CAC GAACGGC C CTC GGAGANANCTACTCTGCAATGAAGACGTAC GATAC 

ACACGTAGGAGTCCTAGCTCACCAGCCGTATCTAGGTATACTGTACTCGC 

GGATACTCACTCGTGCATGCGGCAATAGATCGATACGCAGTCGTCACGCC 

CATGCTCTCAGTGTGTGACCTTCTGGCGGTAGCGTNGTGGGCGCTATTAC 

TGTGCGCAGCAGGCGCNTCGTACATGTGTCGGGTAGCGATGCCAGGAGCT 

GTAACATAGCAAGTCGCCCCCCTACTCCTATCACTATCCCTACGCTGGAC 

CGCACTCGAGATCTGAACGCACGTCTTAACCTGCCAGTACTCGTGAGACC 

TATACTGCGCAAGCCTTGGCTAGGAGATCCTGCAGCGCCGGCAAAGAATC 

AGCTATGAT CC C CTTGCGATTATCGCACACGCACCATAGAGTATGTGCAT 

ATTAACCTCTGAATGTGCTGCAAGCAGACGGTTGCTCAACATATATATGG 

ATGTGGGGAAATCGCCCTGGTCACCGCCACTTGGCGTCAGGAGGCACCAG 

C ACGT CT GAGTGTCAC GCAC GTT ACTC 

>Contig11

GGCCGAATGGTGAATTCATCCGTCGTCTCGAGGGGGTGAAAGACGGGGAG 
TTATGCTGTAATGGCACCGCTCACCCTGGGCTTATGAGCAGACCTAACCC 
TC C CANAGTGCTGGGATTACAGGCATGAGCCAC CGTGC CCGGCC CA GTAT 
CTGAACTTCTGTGGCCAGGCAGAAAAGGTCCTGTGTTACTCGTCTCCTTT 
ATCATTCATGTCCATATTCTCCCATTTGCTAACATTTATGTTTCTGCTCC 
ACTGGATTCTTTGGATTTTTCTAGAACATACCCATGCTTTGCATTGCCTT 
GGTCTTTGAATATTTGGTCGACTTTTCCTGGAAAGTCCCCTCTCACCTTA 
TCTT CCTGGT AAACTTCCAGC CAAC AC CTCTTTACTAACCAGAGAAACAT 
GGTTCAACTGTGCACAGGCTTGCACAGAAACTGTTCTCATATTGTCTTGT 
CATT GT CAAT GTGGCAGAGATGCAC CTTAGAT AC CTCTTTGAGAAAGGAC 
TCACTGCCCAGCTGCCTGGCACGTGATGAGCTGATAGCTCCA GCTA TAGA 
C^CC^TAGGGTCAACCTCTGCTTTCCAGTTGAGATCATATCCTTTGCAG 
GGTGGCCTCCCCAGTGATGACTAAGGCAGTGTTACAATGGCCTAGTCATT 
TCCTCCCAATGCTGGACTCCCAATGAACCATCTGCTCCGGAGCTTCCCAC 
TGGGCAGTCAGAGACCTTAGCTAGTCTGCCTCCGAATCAGAAGGCTCTCT 

CTTGCCACTCTGGCC 
>Contig12

GCTGTGTCTAAAGATTCACGGCTGTAGTTCCAACTCCCGCCGCCCTCTAC 

TGTGTCCTCTTAATGGCAGTCATTCACCATCTTCCTGTCCCTCCCCTTCA 

TTTCTTGGATGGTGACTGTCACTTTGCTGCAACAGAACCCTGTCCCAATC 

^TTGATGGTTCAATACACAGATAGACATTCTTTTTAACAGGGCGGCCTCT 

CAGGTCTTTAATTTTCTTCCCTCCAATAACCTTGTGATGATCCCCC^GCT 

TAGCCACTTACTGCCAGATCATTACCAGTAACTCCAGCCCCTCCTTAATT 

CTAGTTTCTAATATCCTAATCTGTGACCTCACATTCCAACTTCTTCATTC 

T TATCCCCTGAGTCAAAAAATCCTTTGATCCATGCAATCCATTAAGTCAT 

CTACCTTTTCAC CATTCTTCGCCC CACTAGGGTTCTCATTC CTTTATTAC 

CCATATGAAATTCCAAGGCCTGTTGGAATCACTCCCTTGCAGCCACTGTC 

AATAC^TCTGCCCCTTTTACTTCATCACCCTTATGTGGCAAAACCACAGC 

CCTGGTGGAGTCGATCCTTACCCCTGCTCTGTGCCAACAGCCGCACACGC 

ATGGC^GATGGAGGTTGGAAAAATCCACACATGCAGTGGGCCCTGTATGT 

CCATATACGTATCCAACCTCCAGCCTTGCATATGCCTCAGTGCTGCCTGA 

CAACACATTATATGTTTTCCTTAGTTCCTTCAGTCTCCTGGGTGCCTAGG 

TGAGTATCTCAGACATCCTTCTCTCTCTGCAAAGCTCCAACACCTCCACG 
.^ACATTCAACTGATGACrGTGTCTCCTATGTCACTTAGATCACAaAGGC
ATACATAAACAAATCCCAGCCACTGCCAGCACTC7GCACATC-GCGAGCA 
TGGCACCCCCAATCTAGGCCTTTCCTGCTGTCACTTGGGGTGAGCTGAT'- 
ATACTCGATCCTAGTCATTTCTACTTATGCAC " 

>Contig13

A * XT - GTTTTvjGC x GCGCC7ATAGTAAAGTAACCCCTATGNCAAATTTTG 
ACAC i TATAGTATTTGACAGGGATAAGTATAAAATTGCT7GATTGATAC 
ATCCACACCCAAATGTATGCTGGGAATGATTTTGTTTCACGGCACTCATT 




miwnu^nyy, iALtfaAUAAtJATAACAATACTC ■ T' r 

CTTTTCTCTGTAATAACATGCCTTACTCACCTTTACAATAGGTTTGTTGT 
GAGAATCTTGTAATGTAAACCCTGGGTGTTCTGTGAAGCATTTTTAAACT 
rCTAGTTTACACTGACTCTTATTCAAGTGTTTTTAAAAATATATTTAAAA 




- wx^vjo^vjou^uft i. u\u\AUUT w AGGAGTTTGAGAC CAGCCTAGCCAAC 
ATAGTAAAACCTCGTCTCTACTAAAAATACAAAAATTAGCTGGGCGTGGT 
GGCGGGCGCCTGTAGTCCCAGCTACTCAGGAGGCTGAGGCAGAAGAATCG 
CTTGAACCCGGGAGGCAGAGGTTGTGGTGAACCAAGTTTGCGCCAATGCA 
CTGCCAGCCTCTGCAGNGACAGCC 
>Contig14



iu«, 4 n«™i iiu*iliuli laai GGTGTACACITTTCCATACATTAC 
ATACTCAAACTGGTCAGCCCTATGGAGCAATAGCAGCAAAGTTATTCTTA 

acagtaattaacaatataaaagatcccatttaaaaatggttactggtcag 
ccgggcgtggtnnirrcnancctntaaccccancactttggaaagcatgcg 
ggcgatcccaagtctgatatcgaaacatctgcctaacatgtgcaacccct 

CTCTACAAAATACAAAaaaTaTrrrvv^r-~rr!"rr!'T«rr>r.^^^o^T~-» 




atatccgctctcttcaccg^gcgcgaccaaattctttagtataggaaag 

tttattgttttatgcctttgtcaaggctctactgtatcttttctgtccac 
tcac 

>Contig15

ggt7ctgaacaacagcaggcgattcctagccctgtacccggggcattgtc 
caacactcgacagggctgaattcgtccataacggtgtgcccctctgggat 
ataggatgaaatgaattgatctgagtacctgggatgtaaagttactaaaa 
C GCCAG CTAGGTTCACGC cccgatgcttaaatatgatcgtggcct acac c 
TCGTCCAGCAGAAAAAGTACCCTTTCTTCAACACCACCTCACGATCCTCC 
AATTTAGGAGCTATAAAACTCATGACTCTTTATTTAC C CCCTGCAGATT C 
TCAATCCAATAGTGTGTGTCTCCCTGTGAACTCACGGATATACCGATTTT 

CCCCACGTCATTTCCACACGTCGCAATCGCTTAGTCATCCCTATGTATGA 
GAATCATGGATGA.OTATf3TTr:a AftTrra<rf"rii t* » k^tt*?-* % *>r*~r*~* «~ 




* *>-y>y_ i.«ni 4 i^^^^ALiAl l^LL-AALGCGACTCGACATATT 

GTTATCGCCCAAGGGACCTCTTGCATCCCCCATATCCACTGGTCACCTCC 
CCTCTTGGCTGGAAGTCACCGGGAAGTTCTCCACATGTTGT 
>Contig16

TGCGAGCGATGTTCCTAAACTTTAGCGCCATTGACTCGAGCATGGTCATG 
GCTGTTTCCTG 

>Contig17

AGGGTGTTCCTAAAGaATACTACGTTCCCTAAAGTCCAGAGAAAAAAAAA 
AAAGTAACATAATGTGGCTTATTTGGTATAAAAATTTTACAGGAAGCATT 
GTCAAATATGAAATAGTGTTTGGTTTTGTTTGGGCTGTATTTGTATAAAT 
ATGTTATTGGTATGTGTTCCAAAATTATAGGAAAC7CCTATAATTCTGAT 
ATGACTTGGTGTACATTATCAGTAATAATTATAATTGTTATGGTAAATTA

"TGTGTGCCATGGAGGTAACAAATTTCCTCATCAAGTGTGTCTTTGACTA 

TGGTTGC CC7AAAACTTTTTGCCATTCACAGACAATTGTCTTGCTTTGGT 

CCTCTTTAGAAGGTGGTTTTATAATCAGCTATAAAACTCTAACGGGTGCT 

CTTGAATGCAGGCTTAAGATAGCTT7GGAGACTGTGACATCAGAATAGAG 

GAAAAACTTT CAGTATTCATGGAGTGCTGAAATATTCATGAATAT CAAGC 

AAAACAGGAATTAACTTCATAGATGGAACTAAAAGAATGCTGAAGTAATC 

TTTT^GACTTTTTTTCTTAAAATGTTGATCCTTCGTTTTGTTTTTCAGAG 

^CAAGGAAATTTTTCTGTTGAGATATTGACAGCTTTTAACAATTAAGTAT 

ACTCCAGTGAACACAATTTGGAGCATATTTGTGTCTCTCTATATATATTT 

GGAAACAATNTTTGAGTATTCTTAACTTATTGCAATATT 

>Contig18

GGTTGTCTGCTATACCAGTAATGGGATTGCTGGGTCAAATGGTATTTCTG 
GTTCCAGATCCTTGAGGAATTGCCACACTGTCTTCCACAATGGTTGAACT 
AACTGACACTCCCACCAACAGTGTAAAAGCATTCCTATTTCTCCACATCC 
TCTCCAGCATCTGTTGTTTCCTGACTTTTTAATAATCGCCATTCTAACTG 
G C ATGAGATGGT AT CT CATTGTGGTTT CAATTTGCATTTCTCTAATGACC 
AGTGATGATGAGCTTTTTTTCATGTTTGTTGGCCACATAAATGTCTTCTT 
CTGAGATGTGTCTGTTCATATCTTTTGCCCACTTTTTGATGGGTTTTTTT 
TTC^^GCAAATTTGTTTAAATTCCTTGTAGATTCTGGATATTAGCCCTTT 
GTCAGATGGATAGATTGAAAAAATTTTCTCCTATTCTGTAGGTTGCCTGT 
^ CACTCTG AGAATAGTTTCTTTTGCTGTGCAGAAGCTTTT CAGTTTAATT 
AGA^CCCATTTGTCAATTGGCTTTTGTTGCAATTGCTTTTGGTGTTCTAA 
TCATGAAGTCTTTGCTCATGCCTATGTCCTGAATGGTATTGCCTAGGTTT 
TCTTCTATGGTTTTTATGGTTTTAGGTCTTATGTTTAAATCCTTCrrTTT 
TTTTTTTTTTTTTTTTTGAGATGGAGTCTTAGTCTGTTGCCCAGGCTGGA 

GAGCGAGTGGCGTGTCTNTAGGACGC 

GCATGTTGTCTAAAGGTTTGTCTTCCTCCAAAATTCATATGTTAAAACCT 
AGCCCCAAATGTGATAATATTTGGAGGAAGGCTCTTTGGGAGGCAGAGCC 
CTCATGAATGGGATTAGTAGCCTTATAAAAGAGACCCCTGAGGGCTCCCT 
TGTCCCCTCCACCGTGTAAGGATGCAACAAGAAAGTATGGTCTATGATCC 
AAAAAGCAGACCCTTGCCAGGTACCCAATATGCTGGCACTTGAACTTCCC 
AGC CT CCAGAACTGTGAGAAATAAATTT CTATTTTTCATAAGCCAC CGAG 
^ C^AT GGTATTTTGTTATAGGAGCACAAACAGACTGATGTGC CAC C CAAC 
CATGATTATACGTGTAATTTATGGTTTCTCTGCTAGTAGGGATGCACCAT 
GGGGTTAGGAACCACGCTTTTCTTATTTCCCACACAGTCCTTAGCTCTAA 
GCATGTT Z CTGAATCAAAGAT C C C CATCTTTTATGAATGAAGGAGTCAGT 
3 AATG AATT AATG AAAGAACTG ATAAC C CT CAATAATT ATT C CAGC CTTT 
T ATAC CT ACTATTAA 

ACGG^CTCTAAAGACTTTCAAGAGCTGGATTTTATGCTTTAGGTG 
TGATAAAGTAAAGTGCTTTCACTGTGGAGGGGGGCTAACTGATTGGAAGC 
C CAG C G AAGAC CCTTGGGAACAACATGATAAATGGCATCCAGGGTGTAAA 
TATCTGTTAGAACAGAAGACACGAAAATATATAAACAATATTCATTTATC 
CCATTCACTTGAGGAGTGTCTGGTAAGAACTGCTGAAAAAACGCCATCAC 
n * T~r*n * t% & TrrrrraTa a Tr rTATdGT ACAAGAAGCT 




ATATGAATGGuoTT wvj lii uaaaviawi x i AAvjAAnA^nn* — — - ■ 
AATTCAGACATCTGGGAGCAACTGTAAATCACTTGAGGTTCTGATTGCAG 
ATCCAGTGAAGGCTCAGAAAGACAGTACACAAGACGAATCAAGTCAGACT 
^ CATTGCAGAAAGAGATTAGTACTGAAGAGCAGCTAAGACAC CTGCAAGA 
OTAGAAGCTTTGCAAAATCTGTATGGATAGAAATATTGCTGTCGTTTTTA 
TTCCTTGTGGACATCCAGTCACTCGTAAACAATGTGCTGAAGTGGTTGAC 
AAATGTCTCAAGTGGTACGCAGTCATTACTTTCAAGCAAAAAAATTTTAT 
GT CTT AAT CTAACG CTATAGTAGGCATATTATGTTCGTATTAT CCTGATT 
GAATGTGTGATGTGAACTGACTTTAAGTAATCAGGATTGAATTCCATTAG 
CATTTGGTAC CAAGTAGGAAAAAAAAATGTAAAGCCAGTGCTTAGACACA 

>Contig21

C 3 CT G T CTT AAGAAC TGGG CT AGG AGTG AG CAGTG AGC CAAGATC G CAC C 
ATTGCACTTCAGCCTGGGCAACAAGAGCAAAACTCCATCTCAAAAAAATA

CATATATATATATGACCCATAAAAAGGAGATAAATCAACACTTCAGAACT 

3ACCCAAACTTGCAAAGATACTATAATTAACAGAAAAGGACAGTTTACTA 

AGT ACT C CGTATGTTCAACAAGTGAAAGATTAAACATATTAAGTAGAGAT 

GTAGAAGATATAAGAAGATCCAAAATGAACTTTTAGAGTTGAAAACTACA 

ATATTTAAGATAAAAATACACTAGGTGGGATTAAAAGTAGATTACACATT 

GO^TAAGATAAAAAAAAATGAGCCTGAATACAGCACAGTATAAACTATCT 

TAAACAAAAACACAGAGAGAAAAAATAAC TTTAGAGACTTAGCT CTTATC 

CTCTATTTGTTTCTAAACAGAGGATAAGGGGCAGAAAAAATGTTTGAAGA 

AATCATGATTTTTAAATTTCCAACTGAGATAGGAATAGCACTGGGTAGTC 

ACAGGAGGCTGGAAAGACCCAAACAGCAGTTAAAACAGGAACTAGGCAAA 

GAAACCAAAGGATAACAGTAAACCTAAACTAAGGGAGAGAAAACTGACAA 

AAGCTGACTTAGGATAACTGAC 

>Contig22

CCTGAATATAAGCCGCAAGTAACCAATTAAATTTGTTTTCCAAAATTGTA 

TTAACAATCTATGAAATTTTTATCTTGACCATAGCTATAACTTCCAGAAG 

CCTTTTATAACCTCTATAACCTTTATTAAGGAGTAGGTTAATGCTTCAAG 

AAAAC CTTGTTAATCTGACACAGGACCCATATGCTGATCTTGCATCAGTG 

TGGCTTGGACATCAATGATTATGATTAATTTATAGAGAAATTGAACTTAT 

TTTATCTCTCAAAATTGGCCCTTACAATCTCACACACCCACCTCTTCCAC 

TATAGTTCCTGGGCCTTGAGTTGAATAGCTTTAATTTCTGGCTCTSTGTT 

TCAAGAATGCAGTTTATTTTGATTGGCATTTTCTACCAGTCCTGAAGATG 

AACCTTTAATTGCTGTGAGTATTTAAGATTTAGCAGGACTTGTCCTTTTA 

AGAACCAGGAGTCAAGCCCTATAACTCAATGTCAGU^GGACTTTAAAAGC 

ACAT ACATAAAGATATATGGATGTAAT AAT CATAATTTTTAAAAAATTGT 

ATTAATCTCAGTGTTTTCTAAGCAAACCAAAACTTAATAATAATGGCATA 

GAAATTATTTCAATAAAACATAAAATCTGTTAAGCCAGTTACCAAAAGGC 

AAAAGAAAAGACCTTCTGCAATGCAGAGAATATTATGTTGGAAGAAAACA 

TTTCCTTTAGAC CTTTAAGAAAACATTGTTAGCATCAGGACACAACAAAC 

AGAATCTGAGGGTAAAAAACGTATATGAGCTGAAGGGAGTTGAAGGAGGG 

CATTACTATTTCCCACCCTTTTAAAGGGGAGAGAAAACCTAAAACAGCAA 

GATGCAATAAAAGCTGAACTTTGGGTTAAAAAAAAATT CTTAAGTCTCTT 

ATAATTTATTAAGAGTGAATCAACCCCGTAAGAAAATTTCATTGTTCTAA 

CCAATTTTTTAATATATAAGTAGTTTTTTAACATCAACCCAATCTCTAGA 

AAGACCATTATAATTTCCCTTTAATTATAGACAACTTTATCATATAAAAG 

TTTTTTTAAATAAATCCTCTTATTGTGACTTACACAGACTATTCATGACA 

TGCTTGGACTTTCTGGTTTGTCGTGAACATCCTTTTCTTTCTTTCTTTCT 

TTTTTAAATTTTACTTTACGTTCTGGGATACATGTGAAGAACATGGAGGT 

TTATTACGTAGGTGTACATGTGCCATGGTGGTTTGCTGCACCCATTAACC 

CGTCATCTATATTAGGTATTTTTCCTAATGTTATCCCTCCCCTTGCCCCC 

CACCTCCTGACAGGCCCTGGTGTGGGACATCCCCTCCCTGTGTCCATGTG 

TTCTCAATGTTCACTCCCACTTATGATTGAGAACTGCAGTGTTTGGTTTT 

CTGTTC 

>Contig23 

GCTAAATATAAGCTATGATAAAACAGTTGGCCCTCTGTATCATGGGTTTC 
ACAACTGTGGATTCAACTAACTGTGGATGAAAAATACTTGGGAAAAAAAG 
AATGGCTGCATCTGTACTGCACAAGTGCGTGCTTTTATTCTCGTCATTAT 
TCCCTAAGCAATACAATATAACAACTATTTATATAGCATTTACGCTGTAT 
TAGGTATTATAAGTAATCTAGAGATGATTTGAAGTATACAGGAGGATGTG 
CTTAGGTTACATGCAAATATTATGCCACTTTATATAAGGCCCTTGAGCCT 
CCTCAGATTTTGGTATCCATGGCAGTCCTGGAGTCAATTCTCCTGCAACA 
TCTCCATTTG TTCA GATTCTCTTCTATATC ATGT TTATA TCAGAA AATCT 
ACATAAGATTTTTTAATGTGTTCATATAGGTTTTGTGTATTTTTGGTTGT 
TAATCCCTAGATATATGCAGTATTTATTGCTATTATGAGTAGTGTTT C TT 
TACCATGTATTCTAGTTGGTTATTGCTGACAGAGAAATGTTGCTGGTGTT 
TCTAAGTTACCTTGTTTCTAACAACCTTGCTGAACTCTTATTAGTTCTCA 
TAGTTTTTAATTAATCTTTCTTAGTTCTGATAACATAATCTGCAAATAAT 
GACAATTTTATATCTTTCTTTCCAATGCTTATATCTCTCAGTCCTCTTTA 
TCCCAAAGTATTTTCCAGGATCTCCACTATAACATTAAATAGTAATAAGA 
ATTTCTGTCTTGTTACTGATCTTAAGGAGAATAAATTTAAATTTCCTCTG 
'CAGGTITTATGCTrGATATAGATITGTGATATATAGCCTrrCACAGGTT
AAAAAAAAATGCTTTCCTAGTAGTCCTAATTTTTTAAAAAAATCATCATA 
AATAGATGTTGAACATTATCAAATGCTTTTTCTGCATCTATAGAGATAAT 
CATATGGTTTTrTACTATTTATTAATGTAATGAATTAGACCAATTTTCTA 
ATGCCAACTCT7TCTTGTATTTGTAGGGTAAATCCTATGGGATCATAAAA 
^AC^TTTAATACATTGTTAGATTTGAAGAGTTAACGCCTTATTTAGAACG 
ttt-CAGTCACATCCATAAGTGAAATGGCACTATAGTGTCTATTACTATT 
ATA~TTTTCTGGTTCTGAAACCAAAATTATACTCACCTCATACAGTAAGT 
TGGGCAACTTTTGTTCriTrTTTCTGAAACAATTTGTGTATAGAAGAAAT 
TAACTGTTCCTTGAAAGTTTGATAATAATCATCCAGAAAATTATCCCCAT 
CTAGGGCTTTTACAAAAAGGAGACTCTAGAATGCCATTTCGGTTTCCTTG 
ATGTGTATTGGCCTCTTTCATTTAGGCTTTTGGATTTTTTAGGGCATTTT 
TTCACTATAGGCTTTTTACCGG 
>Contig24 




3AGAGGGCTCGAGGACTGGGCAGAt»AA<_ i aui i a **» a v. a ww» 

3AAACCAAGCGGAAGGCGGGGAGTAAAATTGGGAGGGAGCGACGGCCTT 
uaAAGAAGGGGCTTGCATTAGATCGGCGAGATCCGGGAGGGTCTGGTGGG 

gagaaatgactagaggacaaatctaatggagagacagacggagatagata 
tcgtgacagagagagggacagtgacagcgcacaacagtgcagggtccatg 

„ anr:rrrTTaaf3Tfrrft.CA.CCC:CAGCCGGAGTCATGGCAATTCGAT 




^ggtccccaacaagactgcttcgacctcagatgcaaagcacacttcaggg 

GTCCCCAAGCCACTCATGTTrTTTGAATGACTGCCATAAGTTCAAAAATT 
CCCACAATTCTCTCAGATTCAATAACTGGGTATAACCACTCATAGAACTC 
AAGAAAATGCTATCATTATTATTACAATTTTATTATAAAGGATACAAATC 
aGAAGGACTAGCCAAATGAGGAGACACATAGAGAGAGGACTAGTAAAAAA 
CAGAGCrTCTGCGTCCTACCTTCAAGGAATCAGGATGCACCACCCTCCCA 

gcacatcaagtgctcatcaaccaggaagttcctctgagctccaatgtcca 

GAGATTTTAGGGAGGATTCATTACATAGGTATCATTGATTAAATCATTGG 
CCATGTACTTGAACTCAATCTCCAGTGTCCCTCTTCTCCCTAGAGGTCTG 
AAGGGTTGG CTAATAT CATGTGGCTCAAAGC C C CAACTCT AATT AC CTTT 
TTGGTCTTTTCAGGGACTAGACCCCATCCTGAAGCTATCTACAGGCCCTG 
C CATGAGTT AGCTCATTAACATAACAAAGACACTTAT ATTACTCAGAAAA 
TTCCAACAGTTTTAGAAGCTCCATGTCAGGAACCTGGGACATAGATCAAA 

TT crrrri 1 1 1 1 1 r i ttit t ttggagacagggtcttgctgtgttgcccag 

GCTAGAGTGCAACGACAGATCACAGCTCAATGCAGCTTCAACTTCCCAGG 
^AAGTGACCTTTCCACCTrAACCTTCCA AGTATCT GGGACCACAGAAA 

ATGGCTAATTATCCTGGCTGATTTTTAAAClT'r'i 1 I 1 1 "TGTAGGGATG 
GGATCGCCCTGTGTTGCCAAGGTTGGTCTCAAACTCCTGGGTTCAAGCAA 
^CATrcTGCCCTGGCCTCTGTGATGGTTAATACTGAGTGTCAACTTGATT 
GGATTGAAGGATACAAAATAATATTTTTGGGTGTGTCTGTGAAGGTTTCG 
CCAAAAGACATrACTrTGAGTCAGTGGACGGGGAAATCCCCCC^CCCCA 
TGGGACGGGGAGACCCCCCTCCATCCAGGTAAAAAAATCTAATCACCTGC 
AATGTGGCAGAAATAAAGGAGGGAAAAAACGGGGACCCCTANATGGGTTA 

TTCTCCACCTAATTCTTCCCCCAGG 

CCATm^CATTTCTACAGACCCTGAGATGAATTTGTCATTGCCACGG 
GGTCCTGAAGTTCAAATACTCTATTTGGTATCCTGCCCCTGTGGTTAACT 
GTGATCATTTCACTCACCrTGTTTATGATGAGAGGTGCCACCATCTGG^C 
TCCTCCACTCTGCAATCCTGTTAATTCCTATCAAAGCTGAAAACCTGCTG 
CAGCACCCACACCATCACCTCCAGCCTAGAGAGGGAAGCTACCAGTGAGC 
TCTCCTGGATGCCGGTGTGCCCCTCGCCAATACATTTCTrCTTAGTCCCT 
TGGTCATCCTGAGGTGTGTGATTAATGGACAGCTATGTGGATTGCACATA 
ATAGATCTACTCCAGCATCTTCATCCCTGATTTTCCTTTACAGAAATCAC 
-CAACCTTAGCAACATGTGAAAATCACCTAAGGACATTCTTTAAATCCCT 
CTGTCCACATGGCAACACAAACCACTTAAATAAGAATCTCCAGGGAGTCA 
CTCAAGCATCAATGTTTTTTAAAGCTCCAATTTTAAGGATCACTACATTA 
TG^AAGAAATTATAGTATTTCAGCCrrACTGACTGTAAACCAC^CCA 
ATCTAAGCATCCATTAGTCAACCTAGCAGACAATAAACTAACATTACCT 
ZCAGGTACTCAAATCAATTCATTGCATCCCAAATCCCAGATGGGCCCACC

CTTA7TGACAAATTCAGCCCAATCTTGGTTGAACACATTTAGAATATATT 

rCCATGAACAATATCCGGTTGACGAGTTTCTTTAACTTTTTGGAGTTTAA- 

GCCATTTCCTTTCACAGTAGCCTTGTTAATTCCCTGTCAATGCTCCATGG 

GGGTCATGAAGAGACCTCTTATTAACTGTGAAGCAACTTGGCTCAGGTGC 

AGACACT CAAATG CTT CACATGCAGTGGGAAAAGAGAGTGATTGTCTAC 

>Contig26 

TTTAAAAAGAACTGAGTCTTTATTCAGTCGATTCTTCTAATCTATGAACA 

TAG CAT CT CT CTCAAAGCATTTAGTCCTTCTTTAATTTCTGTCATTAATT 

TTTTAAAATTTTCATC CTAAAGATTCTGTATATGTTTTGTTGAATTTATG 

CTTAAGCATTTCACTTTCTTGGTAACAATTATAAATGATTTTGTGTTTTT 

TATTCCACTAGTTCATTTTCAGTGTGTAGAAAAGCAATGAATTTTTGTGT 

GTTGATCTTTGTTCCAACATCTTGCAACATTATTGAACTCATTTATTAGT 

TCTAGGAGGTTITTTCATTTTTCTTGTAGATACCTTGAGATTTTCTATAT 

AGACAGTCATGTTOTCTGCAAACAGGCACAGTTTTATTTCTTCCTTTTCA 

ATCTATATGCCTTTTTTTTTTTT'ITIGCCTTATTGCAGTGGOTAGAACTT 

CTAGCACTATGTCAAATAGCATTGGTGAAAGCAGACATCCTTGTTCCTTG 

T C TT AG AGG AA CATTTGGTCTTTAAT CTTG ATTTAAAAAATT C CTTGCAC 

TAAGTTACCGTGTTTTGCGGGAGGGAGAGGTGGGGTGAGGTGGGGATTTC 

C CCTAATGTTTACAAGCTGGGATTTTCTTTTTCCTGTGTCTAATTATTTT 

C CTCATTGGCTTGAAAAATCTGATAAAACATTTTAGGACTGTGTATAAAA 

TAGAATTAGC CAAGTG CAATGTCTTTATTCAGAAGAAATTTCATGGACGT 

TGTGCCTACTCTCTTGGCTTCCTGGCTTCATGGCTTTCCAGATCCCACAG 

T AAGCT CTGGATAGTAGAAGTTATAGTAAGACTGACTTCTAAATAAATGA 

AGTGACTTTAACCTTACTGATATGGCTTAAAGAAAAGGAGTGGCCTTTAA 

GATCCATGAACTTCTCAAACAAAAGTGATAACGTTATCTCCATGCATATA 

TAATACTAAATATAATGCAACTGAGAGAAGTAGGCTGTGGTAAGAAAGGA 

GACCCAAGTGCCATCTGAAGGCAGCACTTACCACTCTGCTTCATCCCACC 

GAGGAAACAAAGCATGAGTATTGCCAGATTTTCTTCTGTTTCAAGAAAAG 

CCAGAAATC CAGG TTTTTGCGTGAAATGTCCTGATTT-TAATGTTGGGAAC 

TAATTTATATTTTGAAATAACATTGTGTGGGACAAGTGAACTTGTATGTG 

GAACTGCTTTCTCCCAGTGGCGACCAGTTTGGACCG7TGATACTCAGCAA 

GTTCAGCCAAGTGCG CCTT GTCATTGTCAGTCATCAAGGTGATGTGTGAT 

TGGTCAAGCAATTAATTTTGCTCAGCATCTCGTGTGTTTTCAAAAGAACT 

GAAGGTTCATTTGC 

>Contig27 

TTTCAGAGCACAATGCGTATTCATAGTATATTGACTTAATTTCTAAGTGT 
AAGTGAATTAATCATCTGAATTTTTTATTTTCAGATAGGC7TAACAAATA 
GAACATTCTGTATATAAATGTGTAAATTAGAGTTAATCTTTCCAATCACA 
TAATTCGTTTTATGTGAAAAAGGAATGAACTGTTCCATGCTGGTGGAAAG 
ATAGAGATTATTTTTAGAGGTTTGTCGTTGTGTTTTGGGATTCTGTTTTC 
TTTTAAAATTGTAAATATGTACTTGTGTGAATGATTTTTTAAAATGATTT 
TACCATTTTTGGAAGGGTATTTAATGATAGAATATCATCGAGCCAACATG 
CACTGACATAGAAAGATGTCAAAGATATATTAAGTGTAAAATGCAAGAGG 
GAAAACACTATGTACAGTCTGAGCCAAATCAAAGCATGTATGTTTTTTAT 
ATGTGTACAACAAAAGGTTTGGAAAGATATGCGCCGAATTGTTAAATGTG 
GTTTCACTTGAGGGGGTGGGAGGATGGGGCCCCAGAGGGGTTTTTATGGG 
GGCCTTTCACTTGGTATTTTTTTCATTTTGTTCTGTTTGAAATTTTGTTT 
TTTCTTTTTAAATGGAGTTTCACTCTTGTCGCCTAGGCTGCAATG7AGTG 
GCGTGAACTCAGCTCACTGCAACCTCCGCCTCCCAGGTTCAAGTGATTCT 
CCTGCCTCAGCCTCCCATGCCTCCTGTGTAGCTGGGATTACAGGCACCCA 
TCACCATGCCTGGCTAATTTTTGTATTTTCAGTAGAGATGGGGTTTCACC
ATGTTGGCCAGGCTGGTCTGTAATTCCTGACCTCAAGTGATCCACCCACC 
TTGGCCTCCCAAAGTGCTGGGATTTCAGGTGTGAGCCACCACGCCCAGCC 
CTGTTTAAATTTTTTATAAGTATGTACTACTTTTGTAATCAGAATTATTA 
GAAAGCATTTTACTGATTTAAAAGCTTAGACATGTTCAAATGCCTGCAAA
ACTACTTAACACTCAGCTTTAGTTTTTCTAATCCAAAAAGGCCGGGCAGT 
TAATCTTTTTGGTGCCAATGTGAAATTTAAACGGTTTTATGTTTTTCCTG 
TGTTGTGAATGAAAAATATTTCTGAGTGGTGGTTTTTTGACAGGTAGACC
ATGTCTTGTCTTGTTTCAAAATAAGTATTTCTGATTTTGTAAAATGAAAT
AAGC"AGTATTrTAAACATCTGTCAGCTTATGTAGGTAAAAGTAGAAGCA
TGT rf 'TGTACACTGTTGTAGTTATAGTGACAGCTTTCCATGTTGAGGTTCT 

CATATCACCTTGTATCTTGAAGTTTCATGTGAGTTTTTACCATTAGGATG

ATTAAGATGTATATAGGACAAAATATTAAGTCTTTCCTTTACCTAAGTTT 

GCTT'-C-TGACTAGTAATAGTAGTAGATATTTCTGTAATAAATGTTCTCT 

CAAGATCCTTAAAATCTCTTGGAAATTATAAAATTATTGGAAAGACAAGA 

ACAGTTTTTATTCATTATATGCATTATTATCG 

>Contig28

C T>^£rcAAGAAAAGGGAACTGGAGCAATTAAACATATGTAATTTTrTTT 
T AAAAAACCCTAAACCTAAACATCTACCTATATACAAAAATTAATTAACA 
T . „„, r^mn-K » tots a \ xraTnaaarTrTAJLACTTCTAGAAAA 




3AGAAAACCTTTGGTACCTATGACAAt>^-AUAO nu 1 aww>- j. 
TAACACTAGAAGTGTGAACTATACAAGAAAAAATTAATAATTTGAACCTT 
ATGAAAATCAAATTATTTGCTCTCCAAAAGACCCTGTTAAGAGGATGAAA 
ACTAAATTACAGATTGAGAGAAAATATTTGTAAATCACATATrTGACAAT 
GGACTTGTATCTAAAATATCTAAAGAACTCTCAAAACTCAACATTAAAAA 
» a it ATrTAATTAGAAAATGAGTGAACATTTrACGAAAGGGGCCTTATAG 




AAAA1 

AACAC^CATACATTGCTGGTAGAAATGTAAAATGGCATAGCCACTGTGGA 
AAAT T ATTrGGCAGTTCCTTTTAAAACTAAAAATCAATCTACCACACAAC 
CCAGCAATTTCATTACAGGGCATATATCCCAGAGAAATGAAGATTTATGA 
-r»r>rsiii a T rTGT A C ACAAATGTTTT ATGGTCACTTT ATT CATAAT A 



AGCACTAATGGCCCCTGGCCTCCAGGAATTOK^-^^^ 
CTTTGAAACCTTGGCCAAGTCGCTTCTCTTCTCTGGTCCTCAATTTTTCA 

TCTTCA^TGAAGATT^TGACTACTAAAATCTC^ 

^ a «rra.r^riJVGACAGAGGGCAGACACCAAGACATGG 




win* 
GTGGTATAGACTAGGGCAGGACAAAGAACCTAAATCCTCATTTTCTAAAG
ATAATTGTTAATACGTAAAACTCAAAATTCAAGAAGTAACAGTAAAAGCG
GTCATTAAGAAACAAGCACTAAACACCAGATAGGAAGCGAGAGATGGGGG
AAGAGGGCGACAATCTGATTATTTTTTGCAACAAATTTTGTAAAACCATT 
TGACTGTTTACATGTAGAACTTGGATCTTTTTTAAAAAACACAAAATAAT 
AATACTATTATTTTTTAACTGGATTTTTGAAAAAGAAGATAAAAGTCTCA




* 4. vj«.v»vjw\»jviv.a^>\ l . WAtivi i TGGG 

AGTTCGAGACCAGCCTGACCAACACGGAGAAACCCCGTCTCTACTAAAAA 

TACAAAATTAGCTGGGCGTTGTGCATGCCTGTAATCCCAGCTACTCGGGA 

GGCTGAGGCAGGAGAATTGCTTGAACCCGGGAGGCAGAGGTTGCAGTGAG

CCGAGATCACACCATTGCACTCCAGCCTGGGCAACAAGAGTGAAACTCCA 

"CTCAAAAAAAAAAAAAAAAAAAAATTAAAACCTCTGGAAGTTGAGTTTG 

CAAATATTCATTATGCTCATTTTTAACTTGTATGTTTGGAAAATGTCATG 

ATGAAAATTGAGGTTGGGGGATGAGAAAAAAAGAAAAACATCAACCCCAC 

AGCCCATTCAATTTTCAGCCCGACCCACAGCTCCGGGGAAGGGCAGCAGG

TCCATCCTTCACTCTTTCTTCACCTCTTTCCCCTCCTTCTGGCTCTTCCA 

CCTCTAATTTGGAGCCCAAAAAAAGGCACTGGGAAATGGAAAAGTCTTTT 

GTACGTGGTACTTGCCGGGGAAGCTGCCATGAAAACCTGGCCCCACGGTG 

GGGAGGGAATGCCCANCTGAGGCCTCGTGCCCATGCTAGGATAGACTCGT 

CCAAACATGTCAGGTGGTCTGACAGGGCAAGCANCANGAAATCATGTATG 

AGTATGAACTGATCTGTATGCAAGGGCGGGGAGAACACGCGGAGGAATGG

GGCGTGAGAAAACAGCACAGTACGTTTCTTTAGCAGCTGTCTCTGCTCAG

CCATGGGAGGTCACAGAGAAAGAGGCTTGGAGGCGTTATTTTCACTGTGA 

GATGTGAGTGTAAAAAAGTGCCCAAGACACAGTGAGTACCAGGGAGATGC

CCTCTTTCCTACCCGAATGCAGAATGGCCACAGGCCTTAAAACACACACA 

TGGGTCCTCAGAGGAGAGAGGCCTCCACAGTGGACACCCGCATTCTCCCC 

TGGTCAGCAGCAGCAGGGCGAGTGCTGGGCCATCATGAAGCTTCACAGGC 

AATGAGCTCTCAGCAATAACAGGAACAGTGCCTGGGGGACTGTAGCTGCA 

AGACCGATTTTCATGTAAGATGGCCTCTGAGGACTCCGAGATACACCAGG 

CTGAGACTAGCTGGCAGCTCCAAGTTGTTGGTCAGAAGAGAACAGGAACT 

AGGGAAATTGGAATTACTGTTACTACAATTCCTTTACATCCGCACAACCA

TGAGGTCCAGCGATTTTCTATTATT T Tl'Tr r 'ri T AAGACAGGGTCTCAGT 

ATGTCGCCCAGCATAGAGTGCATTGATGTGATCATGGTTCAGTACAGTAT 

TCACGTCCCAGGCTCAAGTGACCCTCCTGCCTCAGCCTCTCAAGTGGCTG 

GGACAGCAGTTGCATGCTACCAGGCCAGGCTTTTTTTTTTTTTTTTTTTA 

GTTTCTGTAGAGCACATAGC

>Contig30

GGTTAACAATGGCACAGGGAAACAAACAGTTCCAGGTGCAGGGGCTCTAA 
ATCTATCATAAGATGTTAGGTATGGGGGCTCTGCCGGACACAAACTCAAG
GCTTTATGCTGTTATCTCTTGAGCGAAATCCTGGGAACTTCGTACATTGC 
TTGCTTCAGTACCTTATCAGTTAATCGGACTCTTTGATATGTTGGGAGTC 
AGCGTACACAAGTTAACTCCTTGAGGAAGGGGGTGGGTAAGGAGTCCTTG 
ATGTCTGGTAAATGAAGGAGCGAAATCGAGTTCCTCTGGCTTTCTCAGCT 
AAGGGAGAGCTTATTCATGTGGAAACAAGGCTAAGTGATTAAGGGAGAAA 
GGGAGAGTCTGAAAACAAGGTTAGGTATTACAATGTCAATAAAATTGGTC 
TCCTTATACAGTCCTATGGTAGATTTCTTTCCATCTTTAATCTCCCTCTA
GCACCACCAGACTTTTTCTCTCTGTACCTTGAGATGTAAATTTTGCTATC 
TGAATTTTCGTCTAAGAGTTGTTTCCTTTAATATGCAAATTTAGGGTTAT 
TTAGCTGACAACTGCCAAAGTAGTGAAACAAGTTATCAAGAACTTGAACG 
TCTAAGGTAGGAAAAAAAAAAGTCTTTATGAATCTATAAGATGTACTTCT 
ATTGGCATGCCTAATACGTCTATGTATTTACGTGTTGTGTACACAGTTTT 
TCACTACTGAAAATATATAGAGGAGTTCTAATTAATTGACTTAAGACAAT 
AAAAGCGCTTGAATCAAATACCTTATCAGGAAAAAGGAAAAGACAAGTCA 
AATGCTTGTTCAAGTTTATATAACTTAAGTAAAATCTTTAATAAATAAGC 
TAGCTTTAACATTATTTGAAATGTCTTAAGAATTGCCAGCAGGTTCTGGG 
TTACAGAACTAGTGGGGGTGCAGTGGGGTGAGGGTTGGTGGGGTGGGGGG 
TGGTACGGGGGCTTTGTTTTTTCTTGCTGCCCCCTTCTGGGTTGGGGAAG 
TGGCAGGACCTTGGCAGCACCCCGAGCCGGCATGGCGTTAATAATGGAGG
GATGCCAGACCCAAGTGGCTAAGGCCCGGCTGCAGAGCCAAGTTGGCATT
T-CAGACTGGGGCTCGGGCCGCACCCTCTCCAGGACCCTCCCCTTGTACC
GAGCAGATTGTCGCGGGCAGTTTGGGCCAGCTGTCCTGGCGTGGAATTTC




"CCATC wA i uwAi i-Uil wV-n. i j. v- - w * w« * 

GGAT CATTACTTTTGGGATGTGGAT ATATT CAGTTAACAAGGAGCAGCT * 
TCAAGAGCTGGATTTTATGCTTTGGGTGAAGTTTAGAAACACTAGCTCCC

AC* 

AC^CATGTGCTCTAGCGCCTCTTACCTCATGCCCTCCACTCTCAGTCTT 
GCACTCACCCTGCCACACTCAAGGGCTTCCCCAGGTTCCTTCTTAGATTC 
^Tl^ ~ * r_ ~r r a r.r,r. a rT-Tr,r ArATGCTACGGTCTCTGC CTGGCTCCT 




A rGTAACAGTTTC TTUAUAvjAouvjiio iiawuaa v-v- w^- * " „ 

-TAT^ TGTGACTGGTCTGTTGCC7GTCTCCACCACAAGAACATAAGC 
TGCATGTGAACAGGAGCCTTGTCTATCTTGTCACCCCAGTG^TGTGACA 
LljU :„:.„« «T.r>iTYiTTTr:A.TaAATGAAGTGCTG 




GAAGGCTATCCTCTTTCTGCCCCTCAGTGCTCTCCATGCCTCCTCTATGC 
C^CCATGCCTGCTTTCCAACCCTGCAGAGGTGGAGAAGTTGCTAATCTCC 
^TGTTGACATGTGCTGGGGTGCCTTGGGCCAGGGAG^^GGTOTTG 
•i^r^rr. a Ta rmTGGCTGTGCCCAGGTCCATGCTCACTTCCTGAGCCCC 



OTAGCC^TTGTTATATAGAGTTAGCTA^ 

^ACCCAAGAGGGGAGAAGAGATATGTAGTAGGATTTAACCTGCAAATCCT 
"c^G^GAG^A^CGTGCACTACATACAGTGGGTAGCATGTGGTAGGTGCTC 

Xataac-^ttgaccgatctattgaatacacgtaac^tcgtc^cactatct 

AAAACGNGGGGTGTGGGGGAAAAACCCCCCCC7TGTTTAGGAAACCCAAA 
TTGGACCGTGTTGGC 

rCGCGATTGTGCTAAAGATCATGCATGCCTGATCAAACGTCCCCATATGG 

AACCTGGCAGVrTAAGTGATTAATCGGTCCTGTACAACTGTAGCCCCTAG 

C^GGAAGCACTAAGCTTCGTTTTCATTTATTTCCTCCCTGGAAC^^ 

AAATGAGGGATGCCTTCCGCCATGAAGTTTTGCTGATTGTCCACTTTGTT 

CTCAAGGAGATATTCACAGTTTTTAATTTGTCTTTCTCTCCTGCATGGT^ 

TCC^?CTCTC^(3AAGCCAGCTGGCTCCATCATCTGTAA^^CC 

ISg^ccagagcacttc^cttcctgttgcccta 
SStcctSccatgataatgtagtgttactacattita^ctcagc 

^GT AAGAAATGTT ACATT CATTTACTT AAAT CAAATTAAGTCTGCTCACT 
"aGT^CC^ScAGTGACCAACTTATAAAAGAGAAGGTACATTTCAGTCAT 

CACTGAG^^CTCTCTACCACTGGAAAACTGAGGAAGGGT 

raSS^AACATCATTGCCTCTGTTTTTTCTCCTACTCAATGTAACCAT 

^S™^^^CACAAAAAGACWTCrrCACCTCTGCTCTCAA 

IEMg^ 
TAAATTTTTAGTGAGAAACTTGAGCTTGCTTCTGTGAGATAT-TATT'I-
AAAACAGATTTGACACTTAAAATGTCTAATCAAGCCTTTTAAACCATGAT 
CTATCTCTTCAAATTCTTCAGATGCCACCATO^ 

:ApiIl^r^^^ TCTCGCT ^ GTCGCC ^ GG ^GGAGTGCAGTGG 




AAGTGCTGG^TGACAGGCGTGAGCCA 

TTTTTTGCCTGATGAAATTTTCCTTGCCACTACTCTGGATG^T^GATAC 
ITTAAATTGTGCTrrr irawn r * m^ZZZt 7T_irr iT TGATAC 





TTGCCAAAAGGCACCTTTTTCTTGTTCCTTTACTTCACTGGTCCGAAGCC
~ GG H^£ ^ AA £^^ AC ^^^^CTGGGAAATGAGCAATT^GCCACGT 
. j. AGCTATTAATGGTGGCACTCCATAACTAGCATCTTAAGCTCAAT 
I^ G ^ G ^ TGTG ^TCTTATTTTGTACTTGCAGGCAC^?^ 
- .TGTAATCrrrTATTCATACTTTAAAATTAAAACAGAGTAATAGAACCC 
ATAGAAGGAAATCAATACCCACGAGTCCATACTGATATAAATAAATAGTT 
ACATAAATAAATGGG^GGAGAAATAACAGCTCTTCCTTACAGAAAAATT^ 

caattaataaatgaagaaggaattagggaaatacaacg^ccatSc 

AACCACAGTAATAATCATTACAGGCAATATCCAAAAATAAAT^CAAAGC 
CAGTGGGCAAAAGTTTGAGGAGATACAGGATATTAACATAGTCTCCAAAT 

agctcatgctatttataaattacaaaaggaaacataacaactgtatagtg 
aagaaactcagcagacaccaccttacx:caagtgat^^ 

TAGTAATAGGGCTTGTTGACATACTGGACTm a rrrria. t> r*n r»rv* * ™ » 



j. x ^ lU^aTATTCTTACCAAAAACAGAATTCTAATGTAA 

TTAAGGAAAATGTCAGACAAACCTATTCTGAGAAACATTCTATAAAACAA 
CTAAC CAATACTTTCAAAATTGTCAAGGTCa Tllir.irrs rwn iTrow 




. unuunum. ^•'^wiAUAI ALAACAACTAAATACACAAATGGAA 

:CATGGCATTCTTGATTGGATCTTGAAACAGAAAAAGGATATTAGGAAGA 
^^^^H^^^^^^^AA^AATAGTATTGTA 
v-uAATATTAATTTCCTAGATTTGATCATTATACTATGGTTAAGTTTTTAA 
CATrAGAGGAATCTGGGAGAATGGTATATATGAACTCCACTGTTCATTCA 
ACTTTTTCAGTAACTATTATTTCAAAATAAAGTT 
>Contig33

GGGAGCGGCGGCCCACGCTGATCTCTAAAGCTTTAGACCACAITGGCTCG 
AGCATGGTCATGGCCGTTTCCTG ««*.wtv^i i w*. I «. 

>Contig34

GACGTCTTAGCGCTATATTATAAAGAAATATTCACCTCCCTGCTGAGCTT 

ACAGGGTGTACCTAATGTCCAACAATATGAAATCTCTTCAATGAATTGCA 

G ^^^ A T A TAACCCACAT(^GCTGTCCTCTTTCCTCACCTTCG 

AACTTCCCATGCCAAAGAGGGACCTCTTGGACTCAAATACATCTTAGCAA 

TATAGAAGATGCTGGAGACTTGTAGGAGAAGTGGAGAGGGTTTACAGTGT 
AGCCCCACAGAAAACAACTTATGarrrraTrart^a/-^^r~T<^^^^m-»~, 




^ 1U ^' iv> i av. i UAVaijAKAtCACTAGATCCTGGATGGCTrCTTCT 
CCCTTCCCCTCCTTTCTCTTCTCCTCTCCCTCCCTTGCTCCTCCTTCCTC 
CATCACCCACTCCTTACTTCCAACCAAAACTTGACTAGCTCCAGTCTCAT 
CCCTCCTTATTGAAAACTATTTTACTCAGCCCTCCTCCCCCACTCCTGCC 
CAATCTTTATTCCTTACCTACATCAGACTTCACCAAAACAAAGGCCAGGA 
TAATAAACAGGACAAACTCTTTCAAACACATTTTAATGACCATATTTTGT 
TATTTTGGTACAATTTGAGGAGTCCCAATCCCCAGGGAAGACTAACAAGA 
AGTTCTCCTAACAAAGGTGGGTCTCCCCTTACTAAAAACTCCTGTAATGG 
CTGAAAAGAGCATGAGGTTTTCTGCATATCATTACACATTCAATAGAACG 
TCATGCAGCTGTTAAAA^xmTCTGTAa^GGCTATCTTGTGACAGAAA^




ACAGACATTAAACATTCATAGTGCTTAAGATGATGCATAGTATAATAGTT 

AGGACCATGGCCTTTGGGACAGAAAACTACAGCCTCTCTCCCACTTATCA
GCCATGGGACCTTGGGCAATTTGCTCAGCCTCAAAGCCCCTGTTCCTTTA 
TCTGTGTGCTGGGGTTGTTGTAAGAGTTAAGTGCAATACACAGAGAGAGA 
GAGAGTACCTAACATGTATTATGTGCTCAGTCAATATGCATCATAGTACT 
CATTGTTACATATGTTCCTAAGTGCTTTATACGTTTTTTCCCTAAGTTGA
rCATCr-GTTTTTGGCATTATGAAACATAATGATCCTAACAAATTAAAATT 




GAGAGAAGTCTCCAGAAGGTCTTTCGTGTGTTAAAGAAGAGGGTAACAGG 




TG^GTAGAAATGGATGTGCAGCCACCTCATCTTTTTCTACCTCCTTCAC 
ATGT^T' r AGATAATTTAATGTAGTAGAAGACGGTTACAGCAAAAAATTAC 
AAAAATCAAAA^TCTCTGCTATCTACTGTTGCATTTCTAACCATCCC^ 




TATTTTG7TCAAGATCTQTGGTT i Wj^uiwwwwwv. * w^~«-~.— - 
TCTCGTCTCCGTTCCCGCAGTGTCAGGAAGTGTAACTGAGACTGGAGGGT 




AGTATCACAACCTACAGAAGAGTGGAGGAATATTGAAAGTTCACfl^TCTC 
A^TCATCAGTCACAGAACCTGCACCAATTCAAGAGGAGAACATATAGAGG 

Stotctc^gg<»taagtgtcaacaaatttgcatctatcacaatctg 

caacctcccaggggtcgcaaaagcctcatccatttatggcaaatgtggcc 
ct^taat^atataaaataatttgcgggggcttcctttatatttttaac 

tcccctgc 

TTTtS^GCAGGGGACTAAAAAAATGATACGAGTCAT^ 
TgE^oScTAAACSA^^ 

AAATCTAATAAAACAAAATTTGTTAAACACATTTACCAAATTTTTTTTTT 
T^TAAA^TTCA^TGTCATAAATAAAGCTAAAGTTCCTCTTGATGACT 
r^TCC^CTGCCCWTTCCACTCCAAGTAACCACTATTATCAGTCTTGCC 

^gS™g^gtgcagtgttttgttttacgtaaatgttatcactct 

mcSG^CATAAmCCTTTrrCTCTCAATGATTTGCTTG^TATC 
TSSScAGTAGCATCTCCTTTCTTTTTAACTrACC^ 



^irrr-GCC^CTATCAACAGATATGTAGGTTGTTrCTAGTTGATTTCATT 

^^a^G^^^^CTCTTC^TGGCATGTCTTCCCTTTTGTATCTGAGAGAG 
AAGCTCACAGGCACCTCUGCTTTCITCl^GACCACCTGCTTTCCTGCCa
CCAGCTCTGTTCCATCTTATAGAATGGTTGCCACTTGGGTGTCTGCTCCG
ACAGCCATGTCATCCTTTGCACTGCAGTTATGAAGCAGACAGAGCTAGGA
GAGGGGCTTTGCCAGCCTCTGCCCTAGCTTGGAGAACTTCAAAAAAGGAG 
GGTATTGAAGTTGAACTCCCCCAAAAAGGGGTGGTCCCCACACCTCAAAA 
AGTGGTGCCTCCGAAAGAAATGTAAAATTCGTGTGGGGGGGGAAAAAGGT
TAJTTAGAAATTGTTGGCTTGTCGTGCCGAAAGTATGTGTGGTTACGGGG 
AGTACGGAAATTTCGAGGGGTGGGGGCGAGGCCGTGTGTCCTTTAGCCCG 

GGGTTTTCCCGTCGCATGTTTAAGGGGGGGGAAGAGGGGGGATGTTTTCT 
TTCCGCGAAGGTTTTTGAAGAACGGCGTGG

>Contig36

CCCCCCACCGCCACTACTCAACCGGCCGTTCACGAAACAACTCGCCACAT 

CCACTAACCCGCTGGCTCACCACCCACCGCCCTCCCGATCCCCCCAATCC 

AAACTCAACCCCCACCACCAAGCGCCTCCCCCCTCCCCCACCCTCCAGCT 

CAGCCCCAACCTACCACCAACCCCGACTCGCCCACCGAAAACCAACAGCA 

AACCCAAATGCCCACAAAACCAGTGTCCAAACCCTCCTTCCCATCAGTTT 

GGTGGGCCCATCACCGCTTCCCCTGGCCCAGGCTCTCCTTTTGTGCGCTT 

GGAGCAGCAGACTGATCTCCCAGCCTTCACTCACTTCATGTGGTAATCTG 

TTGTGTTCATCACTGTCAGAATCTTCTGCATCCCCTCACTACTCTGCTGA 
AAACACTCTAGTGGTTCCTCATTRrTra tti & rrsn & , 




TTTTCTCCTTACTCTGACAGACTCTCCGTCTGTCATTTATGTATTCTTTT 
ATTGCTCTCTTCTACTTTTAGTATGAACTGGATTTATGGATTTTTTTAAC
ATTGCTTTCAAGTATGGAATAAAGAATTTTATTTATTTATTTATTTATTT
ATTTGAGACTGGGTCTCACTCTGTTGCCCAGGCCAGAATGCAATGGTGCA 
GTCATATCTCACTGTAACCTCGAATTCCTAGGCTCAAGCCATCCTCCTGC 
CTCAGCCTCCTAAGTAGCTATGACTACGGGTGTGCATCACCACATCTGGC
TAATGGAATAAAATATTACAATGCCTAATCTTAATTTTCAAAATTTTAAA 
TTACATTGTACCTAATGCCCATGCATTTACTTTTTTCAGTGGGTCAATAG 
CCCTCACTTTGGCAAAGGTCCCAGGCCCAAGGTAAGGCCTTACTTTTTCC
AAACTCATCTTTTGAAAGACATAAGTGCCTGTAAGTTGTACCACATTAGG 
TTCTAGGAATTTTTCATCAAAGACTTTATCAGACTATTTTCCTCTAAGTT 
GAGAAAGAGCTGGGGGCAGAATATGGCACTGAATGACTGAAGAGAAGGCA 
CTGAAATCAGGCCAGAGGTTGCTGGAAAGAGCAATGAGGAACACCAGCAG 
CAATGAGGAGCCGGTGATGATTTTGGCTTCACAGGGAGGTGTGTACCACA
CCGATTTTATCTCTACGTGGATGAACCACAGCTGTCGGCTCCCTTGTCTC
CAGGACATCACACTCTCCACATTCCCTCCCATCTTCCGGCTTCTGCTTCC
CGGGGCCCTCATCTGCCCCATCCTGGGTGAACACTGGTCGGTCAACTGCT
GGGCGTACCTTCCCGCTCTGCACACCCTCCCTGGCCACCCCACCCACTCT
CACGGCTCGCACTGCAGAGGAGCCGCATCTCTAGCTCCAGCCCATCTGCC
TCTTCTGAGCTCTAACTTCATGTAGGCGACTCCTGCCGGTGTTGCCTCAC
AGGCCCATCATACTTCAAAGCATTTTCCCCTCAGAACACCATGTCCTGGC
TGCTCCCTCCAGAAGATACATCTCTCAAGCACATCCCCGCGGCTCTCACC




ACACCTCAAAACTGCCTTCAGGATAAACCACTGCCCTTGACATGTTCACA 
GGTTGCCCATGGCCTGGCCCTGCCCATCTCTTCAGCCTCATCTCATGCCC 
CTTGCCCCTCGCTCrCTGGGCTTCTGCCTCCCTAGCCCTCCTTTAGGTTC 
TCTAACACACCATAGTCCTTCTAGTGTTGGGGCCTCTGCAAGTGCTGTTC 




TCTCTGACTTTTTAAACTAATCAGGGTCTCCCCAGTATATATCTTCATAG

CACTCTGTATTACTCCTTTCTTAATGACCACCTGCTGTAGACAGAATGTT 

TGTCTTCCTCCAAAATCATATGTAAAACCTTCCACCAGAGCGATGATTAG 
AGAAGCCTCCC 

>Contig37




ATAATTrrTTCCACTTTCCTACTGTTTCTTGTTCTGGGCTCTGTCCTGCT 
TACTGTCTAATATCTTGGCCCCTTAAAAGTTGCTAATCTTCCAAACCTCA 
r~CC""GTGACTGGGCCU^TGGTCCTTG\ . CATGGGCCTTGAAGATAC1

CTGTACACTTATCTGGAGCATCCAGTGCCTACCACCTGACCCAGATTCCT 

r-ATTGCGCTCCTCCCTCCTCCACCTAATGGGATTTGCTCATACCCGTGTG 

GGACCCCTCCCATTTTCCCCAACTGAATACTTATCAAGACAACGCATTGC 

~A~ACTCCCTCGTACCCTGCTCTGGGCATCAGACTGAATGTTTGTTTCCA 

"TGAGGATCTGCAGCTGCATCAGTTTCCCCAGCACCGTCCAACCCCTTGA 

GCATGGCTAGTCCTAAAGCAGAGAATTAGCCTTTCTATCCCTGCTGCTAT 

ACATGCT3GGACAAATAATAAGAAATGACAGCATTTTATGATAATGCAGG 

<~' r GCAGGAGGCAGGAGGCAGGAATCAAATTCGTGCTTATCAAATAGTGCT 

-CAATTCTTTGAATATTGGACTATAGAATATGTCATGGATCTATGCTCAG 

GTGGGTTCCCTATTACTCACTCCACTGAGGCCAGGTTGTGGGATTAGCTG 

TCCAAGAGGGAGTTTCAGTCTCACAGCATAGGGTCATTCTGAGAATTACT

GGCCCACACTTGTGTGGAGACCTCCAGAGAACAGAATCTGGGTTGGTGCC 

ATGTACTTCCAGGAGGAGAGAAGTGGCAGGATGCCCAGCCCCACAATCAG 

AGGGGAAGGGGCAGAGCCACATGTATGAAGATCCTCTCCCCAGTACGTGC 

CAATCACAGGGCTTCCTAGCTTTTGGGCCAAGGAAACAATGTGGGAAGCA 

AAAAAGGACAATTTTCTCCTCCCTTTGCATGAAGACTGAGCAGTTTTACC 

AGATTCCCAGGGAAACACCCTTCCACTCTGGGTTGAATGTGAGTGAGAGA 

"ATT C AGCT GGAACACTAGAAAAACT ATTT C CTGAGC CACT CACCTTTAG 

CCCTAGAAAGTGTTGGATTTGTCCTTCATCTTTGCCACAGTAGAGACTGC 

TGATAGCATCAGAACTTGGGCTCTGGAATTAGACAGATATGGGTACAAAT

CTGAGCTCTCTCACTTATTAGTGTGGGATGTAGAGCAACTTTTAAAATCC 

-TCCAAACCTCAGACTTCTCATGCATGATGTGAGGATTGTAATAGGGCCC 

ACC^AATAGGGGTTTTTGAGAATTAAAAAAGTTATTCAATGAACAGCATT 

"AGCAAGATGCCTGACCATTGAGAAAATAACAAATTGTTTATTATTATTG 

TTATTATTAAACATCTTTCCTGCACCTTCTGACTGGGGGCATCGTATCAT 

CAGAAATACTTAGGATGGGATGGATTCCTGCATGGGCTGAGTCAAGGGTG 

CAATAATGGAGGAGTGAAGAAGGAAGAAATGGAGGCAGAAATCCCCAGGA 

GCCCAGCATGGTACAAGGCTGAGCTAGTGCTGCAGAGCCTCCTTGGAACA 

GCCACAGAGCTTGCATCTGGCCCTGGAGGAACCTCTTCTAGCTGGCAGGA 




AGAGAATAATTTTAAGGTGTCATTCTCCCCGCCAATGCCATAAGCACACG 
3CC~C-GCCTGGGTTTCTCACTGTGGAATGTCCTCCTGGTCTCCTCATGC 
-CAGAGAGTGGGAAGTACTCCTACTTTAACACCGGCTTTCCTGTCATTTC 
^OTGCAGCCCTCCTCAGCCCCCTCTGCACAGGGAGGTTTCCTCC^.^CTG 
CTGCAGTGCTTTGTACTTGTTAGTGGTACCTGCACACAGGTATTGGTGTC 




TTCTGTAGAGCACATGGACGCC 

GACTAATAAGTACTTCATTATTTGGGTATTTTCCAAGAACAACATATTGT 

AGGAAACCATTCTTTCTAAAAAAAAAAGTGTCCTTTTAAAAAGGTGAATA 

ATTTTT 3T CTAATTC AAAGTTT ATTGAAAAGTT ATGT ATAAAACAAGGTA 

AAAGGAACAAGGAAATAAGGGAAATGTAAAGAAAATTATAGAAATAAAGT 

GGTATTTrTTGGTAAGAAAGCTTAAAGAGAAATAATTTTAGGTAAGAAAG 

AATCTTACCTAAAATTTTGTGCTAGAATAAAGTGACTGGCTAAGAAAGGG 

ATGTTCAAAGCTATTTATGACAAACCCACAGCCAATATCATACTGAATGG 

GCAAAAGCTGGAAACATTCCCTTTGAGAACTGGCACAAGACAAGGATGTC 

CTCTCTCACCACTCCTATTCAACATAGTATCGGAAGTTCTGGCCAGGGCA 

ATCAAGCAAGAGAAAGAAATAAAGGGTATTCAAATAGGAAGAGAGGAAGT 

CAAATTTTCTCCGTTTGCAGATGCATGATTGCATATTTAGAAAACCCCAT 

CATTTCAGCCCCAAAACTCCTTAAGCTGATAAGCAACTTCAGCA^GTCT 

CAGGATACAAAATCAATGTGCAAAAATCACAGGCATTCCTATACACCA^ 

AATAGAC-AACAGAGAGCCAAATCATGAGTGAACTCCCATTCACAATTGC 

T^CAAAGAGAATAAAATACCTGGGAATACAACTTACAATGGACATGAAAG 
AC w . * a . wAGGGTGAAu . GCAAACCAC* ^CTCAAGGAAATAAGAGAGu^A

ACAAACAAATGGAAAAACATTCCATGCTTATGGATAGGAAGAATCAATAT

CGTGAAAATGGCCATACTGCCCAAGTAATTTATAGATTCAATGCTATCCC 

CATCAAGCTACCATTGACTTTCTTCACAGAATTAGAAAAAACTAATAGCC 

AAGACAATCCTAAGCAAAAAGAACAAAGCTGGAGGCATTG "GCTACCTGA 

CTTCAAACTATACTACAAGGCTGCAGTAACCAAAACAGCATGGTACTGGT

A CpAAAACAGATAT ATAGAC CAAAAGAACAGAACAGAGGC CT CAGATATA 

ACACCACACATCTACAACCATCTGATCTTTGACAAACCTAACAAAAATAA 

GCAATGGGGAAAATAATTCCCTATTTAATAAATGATGTTGGGAAAACTGG 

TTAGCCATATGCTGAAAACTGAAACTGGACCCCTTCCTTACAACTTATAC 

AAAAATCAACTCAAGATGGATTAAAGATTTAAACATGGCTGGGCATGGTG

GCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGATGGGTGGATCAT 

GAGGTCAGGAGATGGAGACCATCCTGACTAACACAGTGAAACCCTGTCTC 

TACTAAAAAATACAAAAAATTAGCTGGGCATGGTGGTGGGCGCCTGTAAT 

CCCAGCTACTTGGGAAGCTAAGGCAGGAGAATGGTGTGAACCCAGGAAGT 

GGAGGTTGCAGTGAGCCAAGATCACGCCACTGCACTCTAGCCTGGGCAAC 

AGAGTGAGACTCCATCTCAATAAATAAATAAATATGGAACTCTCCCAACA 

CAATAATAAGACAAACCCCCAAATGTTTTAAATGGGCAAAAATATTTGAA

CAGACACTTCACAAAAGAGGATATGTAAATGGTCAAAAAGCACATGAAAA 

GATGTTCAACACCATTGGTCATCAGGGCAAAGAAAACTAGAACCACAATG 

AGATGCCTCTGTACACCACTTAAATGTCCAAATTAAAGAAAACAAGTTTT

3GCAAAGTTGTGGAGCAACTGAAATGCTCGTGTATTGCTGGTAGAAAAAC 

AAAATGGCATAACCATCGCAGATAATTTGTTGTCAGTTTCTTACAAAGTT

AAACATATACTTATTGATATGACAGTTCCATTCCAAGAGAAATGAAAACA 

TAAGTCCACACAAAGACTTGTACCTGGGTGTTCATGGTAGCTCTATTCAT 

AATTGCCAAAATCTGGAAACAAATCAAATGTCCATCAGCAATGGAATGGA 

TATACAAATTGTGGTACACATGTACAATAGAAAACTACTCTGCAATGGAG 

AGAAATTAACCATTGACAAACACAAAAACATGGACAAACCTCAAAAACAT 

TATGCTGAGCAAAAGAAGCCAGACACAAAAGACTGCTCAGCGCATGATTC 

CATTCATATGAAATCACAGAAAGGGTCAGTTGAAGGTGCAGAGACAAAAA 

GTAGATCTGCAGTTGCCTGGGGATGGGGTGGGAGGTTGACTGCTCTGACG 

CGTAAGGAAATTTGGGGGTAGGTGGGGGATGGTGGGAATATTTTTTGAAT 

TGAATTGGGTAATAGTTTTAATAGGTAAAATATTGGACCCCACAGTATTT

GAGATAGGTTTCAGTCAATTTAGACAGTTTATTTTGCCAAGGTTAAGGAT 

GCATCCGTGACCCAGCCTCAGGAGGTCCTGACAACCTGTGCTGAAGGCAG 

TCAACATACAGCTTGCTTTTATTCATCTTAGGGAGACATAATACATCAAT 

CAATGCATGTAAGGTTTACATTGGTTCAATCTGGAAAGGTGAGGGAACTT 

GAAGCAGGGAGCTTCCAGGTTACAAGGTAGATTATTCTCAACAGAAAGGA

ATGTCTGGGTTATGATAAGCGGTTGTGGAGACCAAGGTTTTATCTTGTAG

ATGAAGCCTCwGGGTAGCAAGCTTCAGAGGGAATAGATTGTCAAAGTTTC 

CTATCAGACATAAGGTCTGTGTTGATGTTAATGCTGGTCAGCTTTTCCTG 

AATTCCAAAAGGGAGAAGGGTATACTGGGGCATGTCCAACCTTCCCTTCC

ATCATGACCTGAACTAGTTTTTTCAGGTTAACTTTGGAATGCTCTTGGCC 

AAGAAGAGGGGTCCATTCAGATGGTTGGGGGGGCTTAGAATTTTATTTTT

GGTTTACAGTGAAGACTTTTCAAGCTAGAC/VCTTAAATGAGTATGTTGCA 

AAATGGCAATTTCTTAGCACGGC 

>Contig39 

GACGTCCTAAAGAAATGCTAAGGTAACTCAATTAACTATGCTAGAAAAGA 
GAGTTAAGTATTTAGGAGGATTTAATATGGTGTTAAAGTTGTGAAAATCA 
AAATGGAGACACTAATGTTAAGAAAACCCTGATAAATGGAGCCAGGGAAG 
GCCATGAAGAAAGAGTTCTCACACTTGTATCCCTGATCATGAAAAAGACT 
CTGCAAAAAACAAAACCTTGCACAAAGGCCATTGCAACCTTACACAAAAA 
ATACTACTTTAAAAGGACATGTGCCCAGCAACTGCCTGTCCAACCTCAGA 
CTGGCAATATCTTTGTTATTGATCTTAGTAGCCCAGCATAACTATTTCAA 
AACAGTGA TGTAA TGCTC A TT I ' 1 1 T ITCrm 'GAAAACTTTTGTCTTCCT 
37AAAAACC777G7C77C7TTAC77ACCC7GAA7ATGCACAGAGT7TAC7 
ATGGAGTGCATATTCCTGTTGCAATGCTCTATTCCCAAACAAACATCATT
TTCTTTTAGAGAGCCTCTCTCTGTTTGTGATTTAGGTTGGTGATGTAAAG 
CAATGGCATAACTGAACACTGATTCAAAGAAAAGTGGCTTTTCTCTTTGT 
TGTATTAAAAAGAGGCCTTATAAATAGGATAGTAAGATTTGTAAGTTGAA
GT~CAAGACCAGTCTGGTCAA^^^

^T^GGCCAGGTGCAGTGGCTCATGCCTGT^CAGC^^GA 
GGCCAAGGCGGGAGGATCACTTGAGGTCAGGAGTTCGTGAC^GCCTG^ 




T^C C AAACTCC CTGTT AAAATCC CT G C C C CAGC GAACTTTT ATTC AATTT 
TGTG^AATGGAGGCTGCTCTGATTTAAAT^ 

-ac?ccatgtcccagatccctagttgttttt:gttttttgtt^ 

ararr'"^C"TGTGTCTTCCATGCTGGAGTGCAG7GGCATGATCATGGCTC 
^E3?S^*S^nrtrtar:TT(^(MTATGTGCTACCATGCCTAGCTAATTTTTT 




CATATACGCGTCTCT 




^^^qqcagcStctggtagtaggtgcctttagaaaagtcc^ 

^---^^mCTGAGcL^^^ 

ar r * TAATACGGTTGATCCAGGGCCTAGGACTCATTCCTTCAT 

TAcS?CCACT?^GTCTTAAAGCAA<MCTCrA^^ 
»- ATTT CCTGTC CCAGCTC1T TAGTTT CATTAAACAAAGGCTTTAGAAAAC.
TCCCAGTAGATGCCTATGTTGCTTCCTTTTAAAAAATTTTGGAGC^GTTT 
CCCTAGCCTAACCTTTTCTTCAGGGCAGGAGTTAAGTCCCTTCTACTGCA
TTCCTGTGAAGATGGTGATTCAAGAGGCAGGGCACCTGTTGCTTTGTGAA 

acagtccactctgcagctgggcagctctgttactagaatgSctcccttc 




„ ^ ^ ^ A ^ hjoaLjC lUAi iUCTTTCCTCZ 

AAGGCAGCTC^GTTTTTCCCAGTTTGCTCCTGGTGGTCCTGAAATATGAT 

TGACTCCTGAATACTCCAGGTGTGAAGGAGAGTGGGGGTGGCCTTTCTAC 
TTGTCATGGCCTGGGTTTTAAGTTGCTfJTr r a tyrnn. mnnnn n^m^ » 




— ii.uy.wi.ci 1 V. 1 UA^AliCKiCGTGTTCCA 

AGCCCCTCAGTGAATGCCTGAAACTGTGGATAGTACCCAACTCiATATGT 

CTATGATTTTCCTATAAATTAATACATGCCTGTGACAATGTTTAATT T AT 

AAATTAGGCAAAGAGGCCAGGCGCAGTGGCTCAAGCCTGTAATCCCAGCA 

CTTTAGGAGGCTGAGGCCTCACCTGAGGTCAGGAGTTCGAGACCAGCCTG 
ACCAACATGGAGAAACCCCGCCTCTACT A iiUTsrs & & » t«p» ^~tw„,~ 



~ j. w« j. ^w^avjvj^uv-v. i iAA i uLUAGCTACTCGGGAGGCTGAGGCAGGAG 
AATCACTTGAACCCGGGAGGCGGGATTTGCGGTGAGCTGAGATCGTCTCA 
TTGCACTACAGCCTGGGCAACAAGAGTGAAACTCCGACTCAAAAAAAAAA 
AAAAAAATTAGGCAAAGAAAGAAATTAACAACAATAAGTAATGAAATAGA 
ACAATTCTAACAATATACTATAAXAAAAGTTGTATGAATGTGGTCTCTTT 
CTCAAAATTACCTTTTTTTTTTGAGACAGGGTCTCACTTTATTGCCCAGG 
CTGGAGTGCAGTGGCACGATCACAGCTTACTGCTGCCTCGACCTCCTGGG 
ACCAAGTGATCCTCCCACTTTAGCCTCCTGAGTAGCTGGGACCACAGGCA 
TGCACCACTGTATCTGGATAATTTTGTTTATTTTTTTTTGCAGAGAGAGG 
AGGTCTCACTATGTTTCCCAGGCTGGTTTTGAATGCCTGGGCCCAAGGGA 
TCCTCCTGCCTTGGCCTCCCAAAGTATTGGGATTACAAGCGTGAGCCACC 

ATGCCTGCCCCAAAATTATCTTATTGTTCTATACCCACTCTTCTTCTTGT 
GATGATGTGAGGTGATravTTf5rr-rrrT«rr'»Tr!ii/-!»>T»^» » ,-.».~,. „ 




.«™wiinu4vimguwiui, i i\jA i A TTGGCCTGATGATA 

TGTCAGAAGGAGGGTCATCTGCTTCGGTGATCCTGGATCATAGAGTCATG 
ATGATGTCAATGGTTGGATGTCAGGAGCAGACGATGTCAATGACTAACGA 
TAAGCTGGACAGGTGGGATGGTGGCACAAGATTTTATCACGCTACTCAGA 
ATGGAGCACAATTTAAAACTTCTGAATTGTTTATTTTTGGAATTTTTCAT 
TAATATTTTTGGATTGCAGTTGACTGTGGGTAACTGAAACTGTGGAATGT 




TAGGGATCCAATTGTTTCATTTCTCGTTGTAGCAAAAACTTGGCTTTGGA 
ATCAGACAGATTGATGTTTGCTATCATTCTAAATGGGTGCAGCTACACTT 
XCCTCAAGAGGTAGTTCTGAAAATTTAACAAAATGTGAATTTCTTGGTAA 
AAAAAAAAAACCTCAAAAATATTCAGTTTCCTTTCCTTTGTGTCTGATGT 
ACTCCATCAAATACTGGGAAATATGTGTCTCTCATAGAAATGTCATGGAT 
CTTTGTAATTCTGATTATCCACAAACCTTGGGGATTAGCTGTTTCAATGT 
TCCTATTTTACAGATAAGAAAATGGAGCCTGTGGTAAGTTAAGTGAGTTA 
CTCATGGCTACTTAACTAATATTTTACTAGGTGATAGGCCAGAGCTAGAG 




TAGAACTACCGGTTTGTAATGAATTCCACTTGTAAATGACTGACCATTCA
AGGAACAAGTGTTTTTTCTATGCTTGACACCTGTTTTGGATGCCAAAAAG 
GATACAAATGTAACTTCAGACACTCTGGGCCTCATTTTGCACTCATTAGC 
ATGTCCAAAATTAAAAAGACTGACCACACCAAATATTGGTGAGGATGTGG 




ATTTGTCATTGCAAAAAACAGAGACAATTCAACGTTGT
GATGAGCAAGCTGTGGT. JT CTATGCA,. . GGTATCCTACTCAGCCAG;
AAAGATATGGCTAAT 

>Contig41

GACAACAATGTCATGCATAAGATGACGATGGCCTGGGTGATTGATGCAAA 

CAAGGATAAAGAAAATAATCAATTTTGTCCCCATTTTCAAAGACAGATAG

CAGCAGCAAGAGTGTAAGTCTGAGGAAAGTCATATTCCTTCCTCCTACAA 

"AXAGCACACACACTTACAAAAACAATACACAGACTCCTGGCCAATGGAC 

rrCAAAACTGAGGAGGATCATTAAATTTAAATGTTCACCGCTGCATGAAA 

"-TCCCGGGTCCTGCCCTCCCTTCCCCACCCTCCTCCACTTGGGCCGGG 

3CACAGCAG7GATTCTCTCACCTCTCAGAGTGAGCCAGTGTTGGCTGCAT 

-GAAGGCTCCAGATATGCAAACAGGGCAGATATTCCTGGACCAGGGTGCA 

ZAGAGTGAGGCTCCAACGCACCCTATTAACTGCATGAAGGATGAATGAGC 

C^CTGGTATGGGCTGGGACAGAAAAAAGGATTCAAGGGGCCCAAAAGGGT 

TTGGG-GGAACCTACCAGGAGCGGCAGTACAGACTCCTTGGGAAGGTGGC 

CATGATTTAGCCACATTCACCAATAGGATAATCTGGAGAATTTCCTAGCT

TGAGTTTCTGGGAGAAAGCAGATTTCTGGATTATCTGGTGACAGGTAACA 

GGGCCGAGTTCATCCACAGCCACCTGCAGTGTTAGCACCTTAAGCTGAGT 

^CCTTGCACCAGGATGCTGTCACGCCCAGTCAGTGTGAGACGGTTCTTGG 

CTGAAGGACTGAAAAGCTTGGGTAAGTGACTTCACCTAAGCCTCTATCTC 

'TGCTCCCGTAAGTCAGGGCTCATTGTGGCTCCTTGCAGGCTTGACTTCA 

GGGTTAACAGAGAAAATGAAGGTACAAGTGCCTTGTGAACTCTGAAACTC

CAAACCAGTCATTCTCAAAGTGCCGTCCACCAGTCTAGCACATCAGCATC 

ACTGGAAGCTTGTTTGAAATGTAAATTATCAGGTCCTCCAGAGCTATGTA

r GAATTAGAAACTCTGGGAATGGGGCCCTGCAATCTATTTCAACAGGTCC 

■"CCAGGTGATTCTGATGCAAGTTAAAGCCTGAGAAACTCTGTCCTATACA 




r~AGAATGGGATTGCTUTl.i-AAi. iuv.lh.i-i * v^<- »■ 

ATAGCCTGGGCTGGGAACCACCAGAGGATTATCTTCCATTGACCZAAGCTG 
ACAAACTCGCTTAAGGCTCTGAGTTTCACACTTGATTTTCTAGCCCCTGT 




^GGATGCACTAATU i^jv-c uuwn ouv- j. uwvuv^w, — . ,"7,,^ 
T GGC C ATAGGTGTGCAGGGCTGCCTGC CATGT ATTAAT AGCTACAGATT i 
GAAAGATCCAAGGACAAGAGACTAGAAAAAAATTTAAAACAGCCAAGCAT 
GGCCCAG7AATGGCATTTCAGAAATCCACCAAAATATTAAGATGCTTTT 




^■"GACATGTGTTGGQTAi- r 1 Uv_ iijioouuiuuvnv.i'»~^"" --- 

~T T TAGGAGTTGACTCAAATGCTCCCTGTCATAATTATGTGAAAAAATAT 

AATTATTAGCTCCATGGTACAAATTAAGGAGAGGTTACATAAATAAAAAG 
GAATGATACTCAAATTAGTAACCAGAGCCCATGCTCTTAAACACTATGCT 
_t r?__ tr . Tr i a ararTACATTGAC 




^C^GTCCACTTCCAGCC^tiAl'jAAij ihlaauai - - 

C G CAT™ AAACAACTTAGGAAT CAGACAAAATAT ACAAAGCATTGTTTGTT 
ACACATTGGATAACAGACAGCACTAGATAGTCGTGTCTGAGAAAAGCGGT 
GAAATGAGCTGAGTCTTAGAATTGCCCCAGTTTACTAAGGGGCATAGTAA 
GGGCATAGCTGCAGCACAAAGAAGCAGAACCCAACAGAGACTGGCGTTCA 

GTAGGCAAGAGTATCAGAGAGGAGAC^GTTGTAGGGAAAAAGAGAGCTTT 




rAGC"GAGTACCGAT(-AljiU<-Ai«w\xo in«««f»in» 

GTGGAAAAGAACAGAAGGAATGATGTCCAAAGCTCACCCAAAGACAGGAA 

TCATT^ATGTTTCCACCAGCCAGAGTGGAACAACCTTGTAACGCATATG^ 
AGTACTC^AACGAATATTTCCTCAATAATAAGTTCAAATTAACTGAGACT 
Avj i av- * ^-^""-^02 „„,,__„ . n.^rrTi Jir a a artTTTTGAGGGAAGC 
tggcgggaaaatatta: -jataatgaa,. .jaaaaagcaatcaacaga;

aggcctagaaagtatacatatgataaaattagcagacattaaatggttat 

gattaatttattttatatgttaaagaaggtagagaagagcataagcacat 

taaagagagacaggaaagtcccagtactcacacagggccaggagcagttt

tcaccagtcaggtgggaaaacttcatatttcatggagcattggtagagta

cacagtgtcttgccttagtagagggataaatgctgttctgttcccgccta 

acccatcttgaaagaaaatctgaaaggatcaaactgtattcaagtaacct 

aatcacatcccagcacacagctcgactagttataaaaacacaaaatatta 

atatctagaaacacaaaaataatatctagcacccaacaaggtaaaattca 

caatgtctagcattcaattgaaattttctaggccatcaaagaagcagtaa

aatatgaccrataaggccgggcacattggctcatgcctgtaatcccagca 

ctctgggaggccaaggtgggtggctcacccggaggtcaggagttcaagac 

cagcctggtcaacatggtgagacctcatctctactaaaaatataaaaatt 

agcccagcatggtggtgggcgcctgtaatcccagctactcaggaggttga

ggcaggagaatcgcttgaacctgggagaaggagaccgcagtgagccaaga 

tggcaccaatgcactgcagcctcattagagaacatcgggaag 

>Contig42

GAAACTAAAGGCTTATTTAAAGCGCGAGACCGTGGCGCCTTTGGACTGGA 

CCCTTTCTAATGATCATTTAGTATCAGGCTATGTGGGAGTTGACCGTTTT 

GCATAGCCTGAAAGCCAACAGTATCACTCCTCCTCTAGGTGTGGCAGAGA 

TGTGAGAGAAGGAGACTGACAGTCTGTGGGTGTGTATGCAGTGTTGGGGG 

AAGCGAGGCACAGGGGACAATACTGTGGTGTAGAAAACTAGTCTAAGGTA

GCATCAGGAAATTCATGAAACCAAAATGAATTTCATAACAGCACAAGACA 

TTATTTGTTTTTGCCTCCCTCTCATTT T ' l"rT TTTTTTTTGAAACAGAGTC 

TTGCTGTGTCATCCATGCTCGTGTGCAGTGGTGCAATCTCGGCTCACTGC 

AACCTCCACCTCCAGGGTTCAAGCAATTCTCATGCCTCAGCCTCCTGAGT 

AGCTGATTACAGGTCTGCACCACCCCGCCGGCTAGTTTTTGTATTTTTAG

TAGAGATGGGGTTTTGTAATGTTGGCCAGGCTGCCCTGTCATTTTTTTTT 

TACTAGTGTCCAGTGGAGTTTTTTAGGGGCTACATAACATGATACTGTCA 

TTAATCTAATGGCTAATGAAAGGGATATGTATATGTTTTTGTGTTTAAAA 

CAAACTTCTTTGGGGTCCTCAATAATTTTTAAGAGTATAAAGGGGTCCTG 

AGATCAAAGAGTTTGAGTTCTGCTGGACTGGGACAGTGGTTGTCAACCCA 

GATTGTACATTAGGGTCATCTGGGAAGCTTTAAAATAGTACTGATGCCCA 

ACCTTACCGCAAACCAATTAAGCCAGAATCTCTGTGGATGAGAAGTCTTC 

ATTGTCATCATCACCATGACCATCATCATTGTCACCGTCACTACACCATT 

ATCATCATCATCATATCATCTTCATTATCATTGTTAGTATCTCCATCACC 

ATCATCAGCATCACCATTATTATCATCATCATCATCCCCACCATCATCCT 

CATCGGAACTTCACCTGCATGGAGGACAATCCACTATGCATTAGGTGCTA 

TGCTATTTGCTATACTCCTTATTCTCACAACTGCCCAGAGAGGCTGATAT 

TATCTCACTTTATAACAGGAGGAATCTGGATCGGAAAAGTTAAGGTAAGC 

TAATTCACAGAGCGAGAAGAGATAGAGCCAGGATTCGAAACCAGTTCTCT 

GCTACATCAATGTTCCCAGTCCTTGCACTATTGAGAACCTCTTTAGTTAT 

GCTTTCACCCCTCCAACACCACAGTAAATTT TT TCTTTTTTT A AAAAAAT 

TATACTTTAAGTTATAGGGTATATGTGCATAATGTGCAGGTTTGTTACAT 

ATGTATACATGTGCCATGTTGGTGTGCTGCACTCATTAACTCGTCATTTA 

CATTAGGTATATCTTCTAATGCTATCCCTCCCCGCTCTCCCCACCCCATG 

ACAGGCCCTGGTGTGTGATGTTCCCCACCCTGTGTCCAAGTGTTCTCATT

GTTCAGTTCCCACCTATGAGTGAGAACATGTGGTGTTTGGTTTTCTGTCC 

TTGTGATAGTTTGCTCAGAATGATGGTTTCCAGCTTCATCCACGTCCCTA 

CAAAGGATATGAACTCATCCTTTTTTATGGCTGCATAGTATTCCATGGTG 

TATGTGTGCCACATTTTCTTAATCCAGTCTATCATTGCTGGACATTTGGG 

TTGGTTCCAAGTCTTTGCTATTGTGAATAGTGCCACAGTGAACATTCATG

TGCATGTGTCTTTATAGCAGCATGATTTATAATCCTTTGGGTATATACCC 

AGTAATGGGATGGCTGGGTCAAATGGTATTTCTAGTTCTAGATCCTTGAG 

GAATTGCCACACTGTCTACCACAATGGTTGAATTAGTTTATAGCCCCACC 

AACAGTGTAAAAGCATTCCTATTTCTCCACATCCTCTCCAGCACCTGTTG 

TTTCGTGACTTTTTAGTGATTGCCATTCTAACTGGCACCACAGTAAATTT 

TTATAGATTTTATAAGCAAATTGTATTTACTGTGCAAGAATTGGTTTATT 

TTTTAAACCATGTGTTGCAAACATACAATGGTTAATTGTGATATTTGCTC 

AGTACAAGATCATCAGATCACTACACAGACTTGAGGTAATTCCACCTAAA 
AGATCTTCAGAAAAAGAATGTGGCAAGCCANGTC . CACTAATGCAAATC
CTATAACAACTGTATCAGTACT

rS--^TAAATATGGACCGATAGATGAATACAGGTAGGATGG^CAC 
CTGTGGGAGGGGAAAG1 « JTTCAGATCAj. CTACAGGACAGCCACCTGGA

CTC1AAATGGCTTACAGTTCCTTCATCCAGAGGGTCTTCATCTAGTACATA 
CCAGGTGCTAAGCCTGGGTGCTGGAGACATGACGGGGAACCCATTTACCA

^GGCTTTGTTACTGTGACATTCACATCTAGGGAAAGCCAGCAAAGGGGAG 
3GATCGAGGAGAGCTTGTTAGGCAGAGAAAATACCCAAGGGCAAGGGAGA 
AGCCAGCCTGTTCTGAGCACACACAGTGGTTCCATC^AACTGGGCCTCAG 

tgccaggttggactggagatggggctgaggagctgtcacagSSctg 

n_AT j. GC - * xGAGTT l 1 rCCTTTTGTAATGTTGCCATTCATGACAATGTGf; 

aagatgggtccttgcagagaagggcagggctgtg^ 

rAAGATGTGAGCCAAGGAAAATGa^A^rar^^^«^Sii-ir^2E 



PCT/LS98/16102 




TCAAGGCAGCTGGTAAAGATCTTITATTACATATAAACTGQAATAAGCCA 

TCTGCTCCAAGACAAAAGAGTAGGCGGAAAACAATACAAGAttGAAATGG 

AATTAGAACAJ^CCTGGGAGX3AATGTGGAATTAGAGTA^AGAGTCCAACA 
CTGGCTGCAATCATAAAAATGTa a a in nr»i» .---»,,-.« ~~-~-Z2 




- -~" — iavjv. i^i Al iAAeTTCACTTGTGTTATGGCTT 

AAATGTGTCCCCCAAAATGTGATGTGTTGGAAACTTGATCCCCAATGCAA 
CAGAGTTGAGAGATGGGACCTTTAAAAGGTGATTAGGTCATAAGGGTTCT 
GCCCTCATAAATGAATTAATACTGTTATCATGAGAGTAGATTCCTGATAA 
AAGGATGATCTCTGCCTCCTCCCCACAGCCCTCTTGTGCATGCTTTCCTG
CCTTTCCACCTTCTGCTATGGGATGACACAGCAAGAAGGCCCTCACCAGA 
TGCAGCTCCTTGATCTTGGACTTTCCAGCCTCCAGAACTGTAAGCCAAAC 
AAATTTCTGTTTATTATAAATTACCCAGTCrCAGGTATTCTGTTCTAGAA 
GCACAAAATGGACTAAGATCATTAGATTATCATTTTTTATCAGACTGTTG 
AAGTGAAAAATAAAAATCAAATAAAGAAATTAAGAGAGCTGCATGCAGCA 
GCTCATGCCTATAATCCCAGCACTTTGGQAGGCCAAGGCAGGTGGATTGC 
CTGAGCTCAGGAGTTTCAGACCAGCCTGGGCAACACGGTGAAACCCTGTT 
TCTACTAAAATACAAAAAACTAGGCCGGGCGCGGTGGCTCACGTCTGTAA 
TCCCAGCACTTTGGGAGGCCGAGGCGGGTGGATCATGAGGTCAGGAGATC 



GAGACCATCCTGGCTAACAAGGTGAAACCCCGTCTCTACTAAAAATACAA 
AAAAAATTAGCCGGGCGCGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGG 
GAGGCTGAGGCAGGAGAATGGCGTGAAACCCGGGAAGCGGAGCTTGCAGT 
GAGCCGAGATTGCGCCACTGCAGTCCGCAGTCCCGCCTGGGCGACAGAGC 
GAGACTCCGTCTCAAAAAAAAAAAAAAAAACTAGCCAGGCATGGTGGTGT 
GTGCCTATAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATTGCTTGA 
ACCCAGGAGGTGGAGGTTGCAGTGAGCTGAGATCATACCACTGCACTCCA 
ATCCAGCCTGGGTGACAAAGCAAGACTACATTTCAAAAAAAAAAAGAAAG 
AAAAAGAAAAAAAGAAAAGAAAAAGAAATTAAGAGAAGGGCAGGTATTAA 
CCCCAAATATCCCACCATAGGGACACATTAAAGTTTGCTTGGCCACTCCC 
CTAGCATAATATATGGAATGTCTTCAAGGACCCTCTGTTGTAAATACAAG 
GCCCTGCTGGACTTAATACAACCTGCAGGCTTTGAGATCCCTACTCTGTT 
GCCATCTCTCATAGGATTTGCAGACCAAATCCAAATACTTAAAATAGCAA 
CACTCACAAA^GOVAATCAGWXAGAJU^gAAACTTCTAAAAjGGCCCT 
GAAACTACACTTTATGAGAGAAGACAATAGGGACCItSAGGGTGGTAGAAT 
TTTCTCT CTATG CATCTATGTTTCCAXX^CTCACTTTCTCAATAAACTCT 
TAAATTGCTTTTAAA^TAACGCAACAAGCAAACATTACATTTAAGAGAAA 
TCAATTTCATAAAGAAGGGGG^SATGTCCAGGGTACTTTGCTTCCATGTTT 
TGCTTCCATGAATTTGTGTTTAACAGAAGATGCAGAAAAACACACAATTA 
TTGCAAAATCAAGGAAATCCACTCTAAACATCCCTTGGTTTCCCAGGCCA 
GTGTCACAACTGAAAACACATATTGTGGCTAATTATGTGTCACAAATTAG 
AATGACAAGGCAAGAAAAAAAAAACTCTCTGATTAACTAATAGCAGCCAA 
CACAGACAGCCTGTGTAGCTCGACTCTGCTGGTTTATAAAAGGCAGAAGA 
AGCAAACGGCTTCTGTGACCGCAACAGGAAGGGCCTCTGCTCTTAATAAA 
TAAATAACATTTAAATTATTCTCCCCCATTTGCAAAGCATTTTCCAACTC 
ATTATCTCATCTGACCAGGTATTATTGTATCTGACCAAGAACTTGTATAC
NAAATAAAGAATAAAAAATAAATATGGGCCANGCACAGTGGCTCATGCTT 
GTAATCCCANCACTTrGGGAGGCCCAGGCGGGTGGATCACTTGAGGTCAG 
TAGTTTGAGACCAGTCTGGCCGACATGGCGAAACCCCGTCTCTACTAAAA 
ATACAAAAATTAGCCCGGCATGGTGGCACATGCCTGTAATCCCAACTACT 
TGGGAGGCTGAGGCACG^OAATTGCrTGAACTCGAGAGGCGGAGGTTGCA

GTGAGCCGAGACTGCGGCCATTGCCCTCCAGCCTGGGCGATGAGAGCGAA 

ACTTCATCGAAAAAACAAAAACAAAAOVAAACAAAAAACACCTTAGAAGA 

AGCGTTCCTCCTCTTGCTTTCTGAAGACACTCTACGCTGAAACAGTAACT 

TTCAATAAACCATCTCTTCTCACCGCACTCTGCGACTTGCCTTGAATTCC 

TTTGTGTGCAAGATCCAATAAGCCTCTCTTGCGGTCTGGATGAGAACCCT 

^TTrTTGGAATACTCTGACACAACAAATTGCAGAAAGAAAGTCTCACATG 

TATAAAATAAGCAAAAAGATTCTCTGGCATCTGAAGAAACAATTTCCTTG 

TCAATATTAGTATCACTATAAGTGTAGAACAACCTGTTGTATGATGCTAC

ATAAAGTATATGAATCTGAATACTGTTGGATACAAAGGGAGACTATNNAA 

TGTAATACGTCGCCCGAAATGACTACACTGTTGGTGATCTTTCTTTCAAG

AAGCANAATATTGCCTCNAACATCCTGTACATGGTATAAAATTTTA 

>Contig44 

CCCAGCAAGAACACCAATACAACGGGGGGGGCGTTCTTTGTGAGGGGTGG 

GGAGGTCAATTTTTTGGAACCTGCAGCAGGTAACACACAAAACTTCCACA 

GCTGCTACCAGCTTTCCAGGAGAGCCTGTGTACCTGGAGAGGRGAAGGCA 

AGTGCTTCCGAACTTGACTTGATGTCTTAGATTCTGCAATGCGTAGTCTG 

TAGGGACAGGCTGTAGCTTATCCTATAGGCTTGGGCTGGAGTCAGCAAGC 

ATCTGGGCTGGCAGAAGATAAAAGATGCAAAGGTGGAGGAAAGCATACGT 

GGTCTGGAAGACAGACTTGGTGGGTGGGTGGCTGCTACAACACCCTAGTT 

AGAGGTAGAGGGGTAAGTCAGTGTGTCTTCTGCACAGGCCTCTTCCCCAC 

CTCATTCTTCATTTCCCATACAGCCTTGCTGAGTTATTCACAAACATCTG 

ATTCAACTGGAAGCTGGGTTGAGGATGACCTAAAGGACTAGTGTGATGCC

TGCCCAGGGGTGTGGGCCCATAGTCAGAGTCCAGAGCCTCCTCTCAGCTT 

TTAGCACATCTCACCCACATCCTGGGTCCTTAATTAGCAATATGAAAGCA

AGCCAAGTGACAAGATTTTGTCCCTGGGAAGTCCAGAAGCACTCCTTTTC 

TCATTTGTATAAGCATAATGATTTGCTTACATAAATAATCATGAAAATTC 

AAATCCCTCTCAGAAATCAGGTCATAAAACCATGAAATGCAGCATGTGGG

CAAGAATCACAGGGAAAGGTAGGTCTTGGAAAAGAAAGGATGGCAGGGAG 

GAAGAAAGCAGGGTGCCAGGGGCCCTGGGCTGCTGTCCAAGTCAGGTGGC 

TCACCGTCTCTGAGAACATTTCACTTTCTGGTAAATGGGGCAGTTGGAGA

TAGAAGGGTTGGGTGAATGCCAAGAGTGAGCACAGCTGAGGTCAGTGCTG 

TGCCTGCAGTCCAGGCGGGAGTAGAAATCCTGGGCCCATCTTACCTCCGA 

CCTCATTTCCTCCTCTGTAATAATGTGGGGGTGGGGGAAAGTTCTGGTCA 

TCAGCCCTAGCATTCCATGGTTCATTTCCTCATCAGTGATGGAAAATCAC

CAAGCAAGAGAACAGGATGGAGAATAACCGGATGGGTGCAATCGGAGGTG 

C~ATTTCAGGTGAGGTGGCCAGGGAAGGCCCTCTGAAAGGGTGGCTTGAG 

-AGGTGGCTGAATGTACAGAAGCTGCCAATCATGAAAGATCTGGGGTACA 

GCATGCCAAGCAGAGGAAATGCGAGTGCAAAGGCCCCGAGATTGGATGTG 

GGCT"AGCACAAATGTGGCATGGCAAGAAGGCCAGTGTGGCTGAAGCAGC 

ATGAACAATGGGTGGAGGGGCTGAGAGGACAGAGGAGCAGGAAAGAGCCA 

GGCTTGGGTAGGAGAGGTGTCAACTTGATATATGATGCAAAGCCCTTGGA 

GGTTCCCAAACACAAAAGCAATGATCTAATATATGGTTTTAAAAATGCCA 

CTCTTGGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAG 

GCCGAGGCGGGTGGATCATGAGGTCAGGAGATCGAGACCATCCTGGCTAA 

CAAGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCCGGGCGCG 

GTGGCGGGCGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCAGGAGAAT 

GGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATTGCGCCACTG 

CAGTCCGCAGTCCGGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAAAA

AAAAAAAAAAAAAAATGCCACTCTTGCTGTGAAAAATTGACCCTGGGGGA 

AGGAGGAGTAGAAATGTCAAAAGTGGAAGCAGACCACTCAGGAGGTCAGG 

GCAATGGACTGTGCAGGAGAGACTGACATCTTAGACTCGGGCAATAGGAG 

AGAAGGTGGTGAGGATTATATTCTGGGCATAAAGGCAACAGAACTAGCTG 

ATGGCGTCAACGTAGGAGATGAGGGAAAGAAAGAAATCAAAGGGCATTCA 

TAGGTTTGAGGGTTGAGTAACTGGGGATATTTAACAGAAATGGAGAAGTC 

TGGGGAAGGGGCAAGTATTGTGGGGGCAGGGGTCAAAAGTTCTGTATTTT

GGCCAAGTTAATTAATATTTGAGATACCTCTTAGGTGTCCAAGTGAAGAT 

GTCAAACAGTCAATTGAATACAAAATCTGAATCTTAGCCCAGGATGGTCT 
ACACACACACACACACA. AAAAAGTCA x wCAGGCATGGTGGCACATGC
GTAGTCCCAGCTACTCAGGAAGCTGAGGCAGGAGGATCACTTGAGCCCAT 
GGTTCAAGGCTGCAGTGAGCTATAATCACATCACTCAATACTACACTCCA 
GCCTGGATGACAGAGAGAGACCTCATTTATTAAAATAAAATTTAAAAAAA 
TTAATTAAAAATAAATCCAAATCTTTCCTGAGATTCATATTCAGGAGTAA




^^wvwwjAwwtnui 1L1 ^jUAUAtjAAUACsAGAGCAG 

CAAAAGGTGGGTGAAGGAGAATGTGGTCCACCAGGCCCAACAATGCTGAG 

CAGTTGAGTAAGTGAGGACCTGGCCACTGAATTTGGCAAGAAAGAGGATG 

TCAGCGGCCCTAGAACAAAAGTGAAGAAGAGCTTGAGGACGGAAGCCTGA 

CAGGAGTGAACTGAGGAGAGAATGAAAGGTGGAGACATGGAGCCAAGGAG 

CACTGAGACTCCCTTGAGTAGTTTTGCTGTAAAATAAAAGTGAGTGCAGA 

GACGGGGCAGGGGGACAGAGAAATGCAGGGGTAGCTGGAGGGAGCCACAG 

AATCAAAAGAGGGTTTTTGTGTTTAAGATGGTAGTTGTCACATAGCACAT 

TAGTAAGTTCATGTGAATCACAACGTAGGTGAGACAGATCACTAATGCAG 

GAGTCAAATCCTTGCAGAGCCCCCAGAGGAGGTGATGAAGGGAAGTGATG 

GACATCATTCAGATGCAAGTAGGTTAGCAATTCCTGGGGTACAAATAGGA

GGTGACTCCTTTCTGATTGCTCCTGTTTTCTGAATGAGATAGCACATAAA 

GTCCACTCAGCCATGTTAGCTGTTGAAGTCCTTGTGGCTGTCATGCCTGT

ACAGACTGGGCTCTCCTCTCCAGCATTTCCTCTCAGACTAAGCTGAGCTG 

CACTAGCCGCTGCCACATCCTCTTGGGGCCATCCTCTGCCACACTCCACA 

TATTGCTGTGGTTTGCTTGCAACCCCTGGAAGGTCCTACTGGCTGCTCCT 

AGAAGAGTCTGGGCGGCATCTCTCCCTTACTCGTTATCACATGGTGCTGT 

AAGCAGTGGCCACACACTTTAGCTGGTGGGATGGGCCATCACAGGCAGTA 

AATGCGAAAGACTGCTCAGATTTTAAAGCACCCATGAATCAGTAGAATGA

GTTTAGAATTGTAGTCATCAACACACATTAAAAAAAAAAAAAACAGGCAC 

TAAAAAAATTAGTTGAGTAGGATAAAGCCATAAAAGATATTAACTACAAC

CCAGATAGGAGGTGCAAAATTGTCCTTACATAAATCAGATGGAAAAAGTT 

GAAAGCAGATAAGATAAAATAGGTAAGCATGACATTTAAAAGGTATTCAT 

GGGACGTGGTTACAAAACCAACTCACAACTAAAAAGTCTTAGGACCTCTC 

GCTGACTTAGGAGCCTGATCCCAACTTTGAGAATGACTCAGTGTGTTACC 

CTGTGGCTAGTGTAGACCAATGATCCTGTCTCAGAGTCACTAGCCAACAG 

CCCATATCAAGTAATTGAAACTTTGACTCAGAAACCTCAGTGTCAGAACC 

rTTGACTTAGGAACCACCTGTAGTGGTTAACTGCAATTTGCACCCCTTAG 

TTCAGGGCTTTACAACACCGGGGGCGGGGAGGGGGAAGGCATAGAGCTGA 

TGACCTAAAGGAAACCCATTGCAGCAACGCTTTTGTGTTAAGTTTACAAA 

TAAGTGTTGTTTTAGAATCCTCCAGGTAATGCCTTTGTTATTTAATGTGT 

CTGAGACAATTCTGCACATTAAAGAATATAAAATATTACCTTGTAATTCC 

AATTTGAAATGTGTAATTGACATTAGACTTCTATTTTAATTTGAAATGTC 

TAAAACAATGT OTTTAA GTTTGTAAAAGGTGTGTGAATTTTGAGTCTGAT 

TTACTACATTTTTTTTTAATTTTClUllUU N ri"rGGAGTTTTAGGGATTGC 

TTAGATGGCTAGAAAGATCGCTAGGCACATGTCC 

>Contig45 

GATGTGTGTACGTGTGTGCAAATACCGTGCCTTTTTTGTTTTCTTTTGTT 

GAAACAGAGTCTCACTCTGTCGCCCAGGCTAGAATGTAGTGGCGTGATGT 

CAGCTCACTGCAACCTCCGCCTCCCAGGTTCCAGTGATTCTCCCGCCTCA 

GCCTCCCAAGTAACTGGGATTACAGGCGCCCACCAACACGCCCAGCTAAT

TTTTGTA1 ri'TlAGTAGAGACGGGGTTTCACCATGTTGGCCAGG 

TCTAACTCCTGACCTCGAGATCCACCCACCTCGACCTCCCAAAGTGCTGG 

GATTACAGGCATGAGCCACCATGCCTGGCCAATACTGTGCCATTTTATTA 

TCAGGGACTTGAGCATCCATGGATTTTGGCATCCATAGGGGTCCTGTAAC 

CAATACTGCACAAATACCAAGGGACAACTGTATTCTAAAAAGACCAAAAA 

TTAATAAGCAGGACGCTGAAGGTAATTGCCCCAATAAAGTCATGATCCCT 

TGCCCAGTGTCTGAACCTCAGCCAGTTTTCATACTCAGGACCTATTGGCT 

GCAGAGGTGGTAGGAACCATATGAGAATCCTGCAATATCATGGCAAGTAT

GCACTTTAATGATATCTGCAGTCCTTCCCCAAAAGGACCTTACATTTACC

A TACTGC TATGTCCTGCGTGAGAGGGTAATACTCAGATTTTTT TT " l"l TTr 

TTTTTTTACACAACGTCTTACTGTGTTGCCCACACTGGAGTGCACTGGCT 
CGA7C77AGG7CAC7GC .C77C7G7T1 - JTGGGCTCAAG7GATTC7C: . 

GCCTCAGTTTCCTGAGTAGCTGGGATTACAGGCGCCCGCCACCATGCCTG

GCTAATTTTTGTATTTTTAGTAGAGACGGAGTTTTGCCATGTTGGCCAGG

CTGGTCTTGAACTCCTGACCTCATGTGATCCGCTGGCCTCCCAAAGTGCT

GAGATTCCAGCGTGCGCGGCCATACCCGGCCGGGAATTCTTTATATATTC

TGAAAACTAATCCTTTGTGAGACATAAGTGTTGTAAATATTGTATCCCAG

TTTGTGGCATGTATTTTTAATTTTTAATGGTGTCTCTCAATGAAAAAAGC

TTAACACTTAAATGAGGTCAAATTGATCACCTTTTTATTTATGGTTGATT

CCTTTGGTGTCATGTGTAAGGAATGTTGTTCCTTCCTGTCCCAAAGTTGC

AAAGATTTCTTGTGTATTTTGTCCTAAAAGTTTTAAAGTTTTGCTTTTCC

CATCTGTGCACATTTCACATTTGCTACATCTCACTGACTGCTTCCTCTGC

TGCAGAGCAAGCTCCATGAGAGCAGGAGGCATGGGTCCTGCTTCTTGTTG

GTCCCCAGAGCCCTATGTCATGACTAGGACCTGGCAGGGGACTAGTGAGT

AGCTCGTGACTAACTGACTCAATGAATGAATGATTGGATGATTGAACAAA

GTGGTATGGGAGTTCACAGCGAGTAAGAGATGCCTTAGAAGAGATGAAGA

AGGAGATGGTATAGGGTAGTGGTTCTCAATTCTGGGTCCATGGTGGACTC

ACCTGGGGACCCTTAAAATGTACCGTGGAGGATCCCAGCCCAAGAGATTC

TGTATGACTGGTCTAAGATGTGGTCTGGGCACCAGGTGATCCCAGTGTGC

AGCCAGGCCTGAGGCCACTGGATTTGGTGGTAAATGAGGTAACTATCAAG

GGTACAGACGTTGGTTGCCAACAGGCTTGGGCTTGAATTTAAGCTTTGTC

ACTGACTTGCTGTGTCCTCCTGCACTCGTTGAGCCTGTTTTCTCAGCTGA

GAGATGGGTGTGATAACACCTACCTGCTGTAGTTGTTGTGAGAGTTAGAG

GAGATAAGCATGTTCCTGGAATGAAGTGTGTTGTTAATCCATCATAGGTT

TTTTGGTTGTTTGTTTGTTTGTTTGTTTGTTTTTTCCTTTTCAAGAATGA

GGTTGAGCCAGACTTTGACAGCTGGGTGGGAAGTGAACATGTGGTGATTG 

GGAGAGAAGGGCAGTTTATGTGAAGGGAATGTAATAATTAGAGAGTGGGC

GTGGGAAGACATGCTGGGGAGAGTGAGCAGGCCGGTTAGCCCTGGTAGAG

GGTGCAAGAGAGCAGTGCGGAATCTGCCAGGGAGACAGGTGGGTGACCAG

GGTGCCAAGGGTGTGGCTTTTCCCAGGTTCCCATGGACACAGCCATCCTC

CCAGATGCCCAGCCTAGCTGTGAGTGAGCAAGAGTTCTGGATTGTCTCTC

TCACTCTGTCTTTTTCTCTCATTCCAGAAACAAAGCAGTGACTGGTACTT

AGGAGGAGAATCAGGTCAAGTTGGGAGAAACTTGCTTCTGCTCAGGGGAG

CAGAAGCAAGAATGGAGGCCCCACCCATGCTGGAAGATGATGAGGGTTTT

GGTTCAGGGAGGAGGAATATTGGGGATCTAAAGGGGCCTGGGAGTGGGGC

AGGACCCTGCCTTAGGACAGGTAGAAACATTTTCTATAAAAAATGGGGTG

GAGGTTGATGGTAGGACCAGGCATCTTTAGTTGGCTCCCTGGAGTGTCAA

GCCCTTGAGATGGTCTTTAAAAGCCATGCAGTGGGGTTTGAATCTGGTGT

~CAAGCTCATAGGTTATTAACATAATGACACTTGGAAACTATTTGGGAGA

GCTCAAGTGAGTGGCCTGGAAGTTCTGTGTTGGTGCAGGAGGTGACTTAG

GATGTGCTGCTCCAGACTCATATCTTTGACTGCACACCTGATGCTTCATC

TGGCTATCCTGTAAGCACCTTCAACTTAACATGTCCTACACAGAACTCTT

GATATTCCTGTTCCTCCCCCAGTTCCTCAGTTCTTACCAAATGTTCTTCC

AGTTACCCAATTGCTCAAGTAAAAAATCTAAGTCCTTCTCTTGGATTTCT

GCCTGTTCCCTCAACATCCCACCTATCCATGAGTGTTCTGTGGGCCCTGC

CTCTGAAATAAATCCTGCCTTTGTCTCCCAGTTCACTCCAGCCACCCATC

CTGGGGCTGCACCCTCCTCCTTCCAAGCCCTCTCCCTTTCCTTCCTGGTG

CTGCCTGTCATGTCAAGCATATGCATCAGTGCGACCAGGACATTTGAAAT

GCAACCAGTACAATTGGGCGCGGTTATGCCTACCAGTTTTTCTTCCTTAA

ACATTTTATATTTATGTTTGAAAGCATGCCACCTTTCTTCACTTGCCAAC

TTGACAGATTTATTAGTTGACAACATCCGCTGATAGCATCAGTAATAAGT

TAATTGTTTTTGCACATGTAGCTTTAATTATTCTCATTATCATTTATAGG

AGTTATTCTTTGTAAAGGGTAACTGAGTTTTCCAAAACAAACAGAAATTT

GGGGTGGGCCCATGGAGCGTGACTCATGAAATCAGATTCTTAGAAGGACC

TCGGCAAGTCTCTGGGTTGCTGTTAATGAGCCTGGCTGGCTGCCAGGGGT

GTGTCTGCCCTTTATGAGGCCACCACTGTTCAAATGCTTGCCTGCA0CAT

~ACTTGCCTAGGTAGTGCTTGTTTCTACTGAACTGTCAGGGATCCAATTC

TTTGTGGTCTAAGTAACAATACTCAGATTCACAAGGAATTGATTAATAAG

"CAGAATGCCAATGTATTACATTTTTGATGAAGACCATATTTACAGTGAT

~GTATCTGCTCAAGCTCAAATTAGGATTAGAGTTCTGACAAATACATATG

TGAGAAGTATGAGGTTAAATACTTGAAATTTGGACTTTTCTAGAAAATCT
GAATGTGATTGCCATTCACATACCTTTCTGGGGATGATGATTCTTGTAC' 7, 
TTTATTTTAAAAGACATAGAAAACTAACTTAAGAATCAGATTGCTTGGCT 
GGGCACAGTGGCTCATGCCTGTAATGCCAGCACTTTGGGAGGCCAAGGTG 
AGTGGATTGCTTGAGCTCAGGAGTTTGAGATCAGCCTGGGCAACATGGTG 
AAATCCCATCTCTACCAAAAATACAAAAAAAAAAAAAAAAACAACCAAAA
AGAATAAATTAGCTAGGTGTGATGGTGCGTGCTTGTAGTTCCAGCTACTT 
GGOAGGATGAGGTGGAAGAATTGCTTGAGCCCAGGAGGTGGAGGTTTCAG 
TGAGCTGGGGTTGCAACAGTGTACTCCAGCCTGGGCGATAGAGTGAGACT 
CCGTCTCAAAAAAAAAAAAATCAGATTGCTTTATTGCTGGTTTTCTTTCT
AAAACTGAGATTGGGTCCCATCATCCCCTGGCCCCCATTGGTTAATGGTT 
CCTCCTTTGTCTATTGAATAAAATACAGATGTCTGCTTTTGGCAACATGG 
TTGAATGTAGACACTGCAGGGTCTTCCTGACTCAAAATGATTTAGGCTTA 
GATAAAACACATTTGGAAATGC^TTTCTGGATTAACACCAAGGAAAGGAG 
ATCTCTTTAAATCCCCTTTCTGTTCCCCCCTCCCTACCCCCTCCAATTGG 
GCTTAAGTAAGAAGGGTGGTTACCCGCTAGTAAACCCCCTTCGAAGGGGG 
TCTTCTCCTCTAAGGGAAAACCCTTGTTTTGACATTTGCTTCSATGGGCC 
CTTGTATTTTGTTCCTTGCTAAACGGGTGCTAAACCAGGGGCCTCCTCTT 

>Contig46

AAGGCTTTTAGAATATTTGCACACTTTAGAAATGGAAATGTTTTTGGGGG 
GCGAGTTGTCTTAATATTTCATTTTTCTAGCTTGTGTGACATCCTTTTGA
AAGCAGCAATTCTGGCCTTTGTGAGAGATGGTGAATGCCTGCAGGTGTGT
GGACCAGTGCGTCCCTTCCTTCCTACATGCACGGCCCCCAGCTGGGCCCA 
GCAGAGTGCTGTTACAGAATAATTTCCAAGGGCTGTGTCTCTAACCTTTG 
GTCTTGTCCCCCATTGCTGTAGATTTGGCCAATTGACTTCATAAGTGCCT 
CTTATGAACATAGATGTTGGCAATGGAAGTTGAGGACCAGTCAGTGGTTG 

CACTTTTTCCCAGATCAAAATGTCCCTGCAGCTACTCAGCAGAGCTGTCG 
CAACTTATACGTAGAAGAGGTAACAGTCCACAAACAGAAAGGCACAGGAC 
GAGAGTGGTCTGGGTGATGCTTCCTGTGGGGGAAAAGGTGATGAGGGTGC 
ATCTGCACACCTATGTTCATAGGTAAGTCTGGGAGGAGGTGACCTCCCCT 
TTGGTTGAGGTGCTGAGGCGTCTTGTTAGAATGGCACTATTCCATTTATC
TGATGCAGTCTGTGGGAATTTTGTGGTATGGCCACCACAGGTACCATGCT 
GGGAACAATGCCAGATACTGCCTGCTAAGCCACAGCATGAGTCACATGAG 
CATTTGTGGGCTTTGGGAACTAAAGTTATTGAACGATAGTTATCTGAAAA 
GGAATTTAGGGAAAGGGGACTTTAGTCCAGCGAACAGTTTGCAAACCAGG 
GGGAAGGCAGCCTTCAGCGTAAAATGAAGACGTGTGTGCCCCAAATAACA 
AAGGGAGAGTTTGTCTTTTAGAGAGTAAATGTCCACGCAAGGTTCCACTT 
AGGCAAATGAAAGATGCAAACTTGCTTAGTTCTGATTTGTTTACATTTGC 
TGAATTCGGATTGGTCCGTGCAGGCTTTTCTGGGAACTCCAAATACATGT 
ATGACCTCTAGTCATACATGGCAAATGGCCGCTTGGCTCTAATTTGAATT 
TAGGCCCAGTTAGTCACTCAGGATTAACCTTTTTCAGGGTTCACAGCTCT 
GAACAATGGACTTAGACCTGCAGGACATAATCTGTTCCTAACTCTGGGAC
TACCTGTGCCTTTTGACTGTGCCCAGTGAGCAGCTGTGGCTCTGGGCCCA 
GACCCACAGGGCGATAAGGCACAGAGGTACGCATGGAGCAGGCTGTCCTT 
GCTGAGTGATCATGAAGATACACTTACATAGAGCAGCACTTTTCCTTCCA 
GTCTTTGTGATTTAACTCATTAGATCCTTATAACAAGAGTCAGTCCTCTA 
TTTAACCCATGAAGCACAGGTGGAGTCCAAGCTTAGTTTGTGAAGGATGA 
GCCAAAAGGATTCTTCTCTTGTAGACCTCAAGCTCAGCTCTCTCCATGGG 
CCCTGGAGTAGGTGAGAAGGCCTCTGTCTTCCAGAGCCCACTGCCAATCA 
TCTACATTTTCTGTTAGCCCAATTCTAGGACATTGCTTTACCAACTGAAG 
GGTGAGAACTATCATAAGTTATAAAAATCAATTGAAAAACAAAAAGGTAC 
AGAACAGAAAATAAAAGATGAGAATCTATTAAACATAGTGATGTTACTGG 
AAAAGGGGGTCTCAAACCAGACCCCAAGAGAGAGTCCTTGGATTTCACAC 
AGGAAAGAACTCAAGGTGAGTTGCAGGGTGCGGTGAATTGAGAGAGTTTA 
TTGAAAGCTATTCCATTACAAAGTAGAGCATCCTCAGACAGCAAGTGGAG 
GAACATGCCATCATTAAATTTTTCTTATATAGGAATCTTGTCTATATAAA
GACTAAACTAAGCTGTGGCTATGTGTGGGTGGGCCGACAGCATGAAAACA 
TTTATTCTCCTATTGATTTAAAGAGAACTATCCTTGACATTTTAGTGTGT 
~AAGTACATCAAAGCAl^CrATAAT7ArCTTGAAAGCATATATTTT:«. 

TAGGGATTGGGACATCTGGGCTTTCTGTTGTTGTAGAAGTTTGTCCTTGC 

AGGGATTACCAAGCCACTTCCTTAGCTGTAAACATCTTAGGGCCATGGGT 

CCTGACTGGCAAGGAATGTGTCTTGCTAGTTTTAAGATGGGCTTGATTTG 

AAAATGGTGTCCATCTGGCTCTCCTAGGCTCCTGCTTTCCTAACAGTAAG 

GGTAAATGCTATGTTATGAAATGTCATTTCTGCCTTTAGCTTGCAAACTC 

"^TGATGGTGAAATTCTCCTGTCCGTTTTCAGTGGGGTATTTATTCTGCAT 

CCACGTCTTCACAAGGAGCTGAAAACAAATTGGATGGAAGCAACTGGGTT 

TTATGGGACACGTTAATGTTTTAATGTCATTTGGTGTGGAATTCAGATGT 

~CAAGCAACATTTTACACTACAAATCTGCAACTTTAATAATCACTCAAAG

TACCTGAACCTCAATGCTTTCAGACAGACTTGGTATAAAGCCACCACCTC 

" r ' T "TCTATTATGGCAGCCCTATCCTGAGGACACAAATTTCTGCAGGGCTTC 

-GGCATATCTCTGATTAAACAAATGTCAACAAGGTTAAAACAAATGTCAT 

CTCTGATTTGTTTGTTTTAAAGCCTGGATTTACTCATTGAATATTTCACT 

CCTACTAGCATGTCTTGTAGTAGTTTTCTTCAGGGACCCTAATTATTGCT 

ATTAAAAATATGTGTGCAGCTACATGTTTTTTTTTTATCAAT9TGCAATG 

AAAACTTTAATTGAATAATCTATTAGTGTTATTATTTGAAAGTGAAATCT 

TTTCCTTTTGCTTTCTTGTTCTCACACATAGTGCAGACAGTTTCCACACG 

GGCTCATAAAAGGAATGATTCTGCCTTGTGTGAACTTTTTGCCTTTATTG 

TTAATTGCACCATTTTGTGACTGGCTTCTTGACCCTGTTGTAACCAAGCT 

CATAATGTACATTATTTCTTATTrTGCAGTTGTAGACACTrGAGGXAGTT 

"CCATTC"*TTGTTTCTTCTTGCTTTTGTTCCCTGTGATAACTTTTTCATG 

CAGACAT*" r TTTTTTTTTTTTTTTGAGACCGAGTCTTGCTCTGTCATC 

ZAGGCTGGAGTGCAGTGGCATGATCTTGGCTCACTGCAACCTCTGCCTCC 

CAGG^TCAAGAGATTCTCCTGCTTCAGCCTTTCTAGTAGCTAGGATTGCA 

GGCGTGCACTACCACACCCAGCTAAATTTTTCAAATTAGCCACCCCACCT 

GGCTAATTTTTGTATTTTTAGTAGAGACAGGGTTTCAACCATGTTGGCCA 

GGCTGGTCTCGACCAGGTGATCCACCCGCCTTAGCCTCGCATAGTTGCAG 

GTGCTATTCTGAGCTCAGGGCTCTGGCAGCTACAAGCCCAAGATGCGGTC 

TCCAACATGTGGCCATTCAATGTCATGGCGCCCTCTACTGGTCCTGGGAA

GCGCAGCTCTGCCAGTAGCTCCAGCAGGGCACAGCTGTTAAGTCGTGATG 

TTCTACAGGTGACCAAAGGGCAATCTCTGGACTCCTTAGCCGCTAGGTCC 

TCTCTGTAGCAGGACCCAGGAGAAGGCAGGGGCTGAGGATGGCTCTCTTA 

GACATTTGTGATGAACCAAACGTGTGCATTCATGAAACTTCTGTGAGCAA 




GTAATGGTGTGATTGGGTTTGCGTTTTAGGAAGATTTCTTGGCCAGAATG 
AGGCGGGCAACCCAGAGCAGGGAGTGGCCACATGTGGGTGTGCAGTTATG 
GGCCACTAATCCAGGTGATAAATGGTGTCTCTGAACTTCAGGTGGGGGTG 
CCACATGTCTCCATCTGCTCTGTACCCTTGAGACTGGCCTTATGGGCTGC 
r TTAGTGGTCTGTTGTCCTCTATCTCCTGGTTGGGCTCAGGCAATGGGAG 




GGCCCTCCCACAGCCCCAGCTTTCTCAGTGGGCCAGCCACCTCCCCATCC 
^TTGC~CCTCCTCCTCCAGAGAGGGTTGTGGATTTCCACTGTCAGCAGTG 
CCTGGAGCTCCACCATCTCCTGCTGCTTCCTCTGGACCTGCCTGCAGTTT
TATAAATAACCTTTCCTTACATTACCTCTAGCATGCACCTTTTGTGTGTA
TAC^CTGCCCCCTGTCAGCACATGACTCATGCCAAAGAGTTTGAATTTTT 
-T CT C CAGGCAACGGGAGGTCATTGGAGGATTTTAGACATTGAGAACAGA 
TGTGTATTGTGGAAATATCTGTCTGACTGAAGTGACCAGGATGGTCCAAA 
AGAGCGAGAATTTGAGGCAAGCAAACCATCAGCAGGCCAGCAGCAGAAAT 




TGGTAACTATTTTTATTTTTATTTTTATGTTTTGAGACAGAGTCTCGCTC 




GAC~ACAGGTGCGCACCACTACACCCAACTAAGTTGTGTATTT!TAG7AG 
AGATGAGCATTCACGCTGTTGCCTTAGACACGG 

AATATTGATTATTTGACCAGAAATTCATGCAGCTAACCGTGACCCCTGGC 
Xaaataaaatagtgtat,_ GTACGTGCA'iATACATGCAAAGAAATGAGV 

GAAACTAGAAGGATGTCAATCAAATGATAACATGGTCATCTTGGGGTCGG 

AGTACATTTGGGGATGAGGGGAGCTGTAAAAGCAGACTTGGACCTTTTCT 

TCTACCAGTACCGTGTCATTTGAATTTTGGAAAGAAAAAAAAAAACTCAG

AAGGAGGAGAAGGAGCAGGAGGAGAAGAAGATGGATCTTAAGTGATTTGC

CCGGGAGCACCTTGAGAAGGTGAGATTCAAGTCTAGGTCTAAGCTTTCTA

ATTCCATGAGTGGGAGTGACCCACGTCCAAGAGGAAGCTCAAAAGGAAGA

TGTTCTCCATCATCTCTTGCTCATCCTAACAGCATGCAAAACCACATCCA 

ATGCAGCTCAGAAAACTCCCAAATTGCCAAATTTCATTGGAAACACTTAA 

TGCTGTGGTTTCCAATTTCAACTGTAAAGTAGGTATGTATGCCATTGTTA

CCATTAACTTCTCAGAAATGGAGAGAGCTCTCTTTCCGCCTCCTCCCCCT 

CTGCTGTGGCTTTGGTGAGACGTGCACTCAGGCTCACCTGTCTCCATGAT 

CTCCAGTAAGTACACATGAGCAGAGAGGCCTCAGCTCAGCTCTTCCTGGT 

CCCACCAGGGTTGATTCTTTGAGAATTCTAGAATGCCACATCCTAGGCCC

CCCAAAGAAATCCTGCATCTTACCCCCAGAAATATGAATCATAGCAAATT 

TCAAATCAACCATCGTTTAATACTCACAGACTGGGCACATCC^AAAACAT 

ATTTTCAGTTTTACAACAGTGCCTGGTGCATATCGGCACTATTTGTGGAA 

GCAATAAATCGACACGGAGCTGAAACACAAACAAATGCCAAATTGTTTTT

ATAACACCTGATTTTCTTTCTGTTTCTTTATGCAGTTTAGTTTTGTTTTG 

CTTAACTCTACCTCAGACCATAGTCTGGTAAACTCACCACCCAGAAGCTC 

CCTTGAAATGTGGGTATGCAGCCACTAGGTGGCAGGAGAGAGTTTCCTGC 

CTGGAGGGAGGACAGCCACTCTGTCCCCGGGTCAGGCCAGGGCCACCCTG

CTACCTGCAAAATTAGCATGGGGCTTTATGAACCACAGCTTCCTAATAAA

CACAGGATCTGTTTGATAGAGACTCCAAAACACGCCTACCTAGTGATGAA 

AGACTCAACTTCAGAAGAAAACCTTCATGGCAAACATCTTCAGAGATGTT 

TCCAACTTAAGGTTCTGAACACAGACGCTTCCCCAGAAAGCCATTGTTTC 

TGAGCACCTGGGAGCCTTGCTTTGCTTTGCTTACAGACTCGCTGTTCTTA 

AATCACTGCCAAGATAACATCTGTCTCTTCTCTTACCCTCTATTTCGATA 

TAAGGACTCCTCACTCTTGTTGCTTCCTATTGGCTACCTCTCCACAGGGA 

GAAATCGCTGATTTAACAGCAGTCAATATCCCAAATCTGGAACAGGGAAC 

AGGGAAGCATTTAAAAATTGGAGAATTTAGGCCGGGCACAGTGGCTCATG 

CCTGTAATCTCAGCACTTTGGGAGGTCGACGTGGATGGATCACTTAGGAG 

TTCGAGACCAAGCCTGGGCAACATGGCGAAACCTCATCTCTACAAAAAAA 

AAAAAAAAAAAAAAAAAAAAAAAAAACCCAAAAATTAGCCGGGCATGGTA 

GTGCACACCTGTGAGCCCCAGCTACTCAGGAGGCTGAGGTGGCAAGACTG 

CTTGAGCCCTGAGGTCGAGGCTGCAGTGAGCCGAGATCACACCACTGCAC 

TTCAGCCTGGGCAACAGAGTGAGACCTTGTCCCAGATAAATAAATTAAAT 

TAATTTAATTAGAGGATTTAAGGATTTTCCCTACAGACACCTCCTTATTT 

TCTCTGGCCTTTTCTGACTACTCTCCCTAACTCCCTGCTCCTCTGGTCTC

CCAAAACTACTCCAGAAAAAAAAAAGGGGGGGAGGGACTAAAGGAAAGCC 

AGGTGACAGTGCCAGTGTGACAGATGACAAAGCATCTGCCCGAACAAACC 

ACCAGTTTTAGCAAAAACCCTCCTAAGCCAGTTTATCAAGATCCCCTTAT
CCTCAATATCCATCTGATTGGATTCTTCATCCCCCACCATTCCCCAGTGA 
TGTCACCAGGCCTTTCTTCAGCAACAGTAGTTAGTGGGTGTAGCCAGGAC 
GCCCCCTCACCCCTGATATGCCCTTTTAGTAATTCTTCATCCACAGGTTC 
CCACCCTGCTCCTAGGCTATACATTCCQVTTTGCCCATGCTGCATTCGGA 
ATTGAGCCCAGTTCTATACTGAGGTCTTACTTCACCTCTCGCCATAGTCC
TGAATAAAATTGGTTTTCAGATTTAAAAACTGTCCAGCTCTGGTTGTTCC 
TTGACACAGGGTAATTTTTATTCCATGTGATAGTTTGCCTTACCTCAGCC 
TACACCCCTCAAACCTGCAACTCTATATTCAAGAACCAGACAGCCCTTTC 
CAACAGATAGGAAGAGGCTGCCCTGGTGCAAAGGAAGAGGCTCTGGGAGG 
AAGGAGAGAACCCGAAGGCTGCCCCCTCCTCTAGACTGAGCTCTGGGATG 
GGTGGACGATAAAACCCAGATACGTTTAGACATCTGAGCGTGGAGAGGAC 
TTTGCTTTGCTTCCACAGGGACCCCAAGGAAACTGCAAGCCCTCCAGAGA 
CTAAAAACAGCAGAACAGCAAGAAATGGCAGCAAAGGTCTGGGCAGAATC 
ATCCTATGTGGGCACAGACACAAACAGAGTCCCCTGTGGCCCCAGGAGAG
TTTAAAGAAGATCCAGAGGCTGTCCTATTCCATATCTCAGCAGAGACAGG 
CCCGTGAGCCTAAAAGCTGATCATTAGGACAAGAAGGACACGAACTGTCC 
TGCAGCGTGAACCGCGTGGAACAAGGCCAATCACCAGACACCAGACCAGC 
CAGACACAGCCCCGCAGVrCCCCAAGACCACCACGGACCCATCGCCCCTC 

ACCAATAGCTCCAGGCTACATAGACCCCCTCCACTTCATGGATGTCCTCA 

GAGCAGAAAGGGGAGGCAGGAGTGGAACCCTGACTTGGTTCAGTTGAAAC 

ATAAAATGACTGTACTATTATTGAATTGCTGAAGTTTACGTGAAAGAAAT

GAGATTTAGTTTTTGGCCACAGTGCAAAATAAGAAACGAGGCTTCAACTG

AGA~TAAGGTGAGTTA7AGGAAAATGTACTCCCTTGAAGGACCTGTGAAG 

TGIGTT CGCT ATG AGAAAATG ACCAGAATC CACGTTCTTAGCTGC GGGAC 

TCAGGCTGACTCCTGTTTCTGGAGCTTGCACAAAGGGCAGGGAAATCCCT 

GTTTCAGGCACAGTGATTTCAATGTTTAAAAGAAAACAGGTGGGCCCTGG 

CAATCATGATAACATGTCATAAGTTTACATCTCTGTGAGGCAGGTAGTGT

AATCCCCATTTTGCAAAGGAGGAAACCGAGGCTGAAAGCAGCTACATGGT 

CTC ^^ C ^ TG - GGCCCAAATGT TGGAGAACAGAGCTTAACTGAATCAGCA 

ATTCTATACTTAGAACTGACTCTCTCTTTATTATATCTCACTACTACCTT

3ATATTT GAAATATT CAACTTTTTTCAATCAAAAAATAACAATAATTTAG 

3CATAATGACTACTATGTCATTTAATTTCTTGCTGATATTTCAATATCCC 

ATGCCAGGAATATTGAAAGCTCAGCTCCTTAAGAGCTGACTATGGCATCA

ACTCCCAACAACCATCCTTCCAGAAATATTTTCCCCTTTCTTTTGTTATA 

GAGTGGCACTGCCCTATATGGTGACCACTTGCCACATGTGGCTGTTGAAC 

ACTTGAAATTGGCTTGTCAGAATTGCAGTGTAAAGTGTAAAACACATACC 

AAATT' T -CAAAGACATGGCACATAATAAAAAATGTAAAATATCTCATTAAC 

^XTT^ATATTGACTGTGTAAGTAACATTTTGAATATATTGGATTAAAT 

ACATGGATGATGCCCCAACACCCACAGTCCCTTATCAAGTCTCTACTTCA 

CATTTTTGTACTTCTGACTTAGAAATAGCACTGGCGTCTAAGAGCCTATT

AATGTCGTCAATAGGTTCTTGGGAACCACAATTTTAAACAAAATGACATA 

TAAGAAAACGAATAACATTGAACAAAATGACATTATTCGAGGACCTGCTG 

CATGTTGTTTCACTTAAAGTCAGTGTCCAAGAACCTATCAGTGACATTTA 

GTGAGGACTTGCTGTCCTTCCTGTTTACAGGAACCTGGGCAAGTTACTTA 

ATTCCTCTAAGCCTGGTTTATATCCCTGCAAAGAGAGAAGGATAATAATC

ACCAGTACTTAGTGATGTCGTAAGGAGAAAATAAAATAATAAATATGAAA

TGGCTGACAGTGTCCTTGTCACACAGAAGATGTGTGATCCACAGTAGCTG

CTATTGTCTGCCTCACTTCACTAGTAATGGTCCAGGGAGGCCTTTAATGT 

GCATGGTGCAGTACATTCACATGTTGGACATGGGTGAAGGGAAAGACCAG 

GCTCATCTAAACACAATAGGATGCTTGTGGTGTTTTGAGGAGGAATCAAG

GACTAGTTATCCACAGCTGTAACATGCATGGATCAAAAGAGATAAGGCAC 




-CTGGGCAAGTCACGCCATCTGTCTTGATGCCCTTCCCCATCTATAGAGA 

GGGAGCAACTGAGGCCCCTTCCAATACTGAAGTCCTTTATTTCTGCTACT 

"^AGAAATATCCACATTTTTGGTAAATTCAAATGATCCAATGATTCCATT 

^CC^AATGTTCAAAACTAGCCCCAGAAACATCTAAATGAATCAAACAAAT 

AAAATATTTATTGTGTATGTTTTGATTGCTGAAACTTCTATTTTAGCAAC

ACACACACACACACACAGAACCCATAAGCCTTCATCTTTCCTTGGATAAA

CGAGCCTTCCTGTCTGGCCATTTAAGTCACGATTAAGTAAATGATTTCCA 

ACTCGCCTTTTGCAGCAGTTCAGATGGGTCTTTCCTGCGTGGCAGTGGCC 

CTCCTGACTTATGATTTCCTGTGTGTCGGCCTGTTACCACTGCAGCTTAA 

CTGAGGAAACAAGAACAAAACAGCTTCTGACCCCAAGAGACTGTTGGAGG 

CAAAGGCTTCAGTCCCAAGAACCTCACACGTGGGGAGCCCGAGAGCCCAG

r CCTGACCTTTTCTCCAGTAATAACATAAGAAACAACAGGCACTGGCCTT 

ATTTTGGATACAAAGAGTGGTGCTTTTCCTTAAATCTTCCTTTAGTCAGG

GCTACCCCTTCATGGACGCCCCAACATCCATGGTTCCTGCTTGAGTCCCT 

GCTTCCATATTCCTGCACTTCTCACTTGAAATATCCCTGGAGTACGTTAA 

GCAGCCAGGTTTGGAAGTTCTTGCTGTGCAGGCGGGTGTGTGCATGTCCT 




CCAGGCCATGATGGGAAGCACTCTCTGCCTAAAGACTAGGGTGATGCGCC 
C^CAACTGTGGGAATGAGCCCCAGCTCTGGTGTCTGCCTCGGTTTTTCCT 
CCTGGACAATCAACATGAACTCCTCACCCCTCTTATCCACTTTGCATAAA 
C7GAAAATAACAAACCCAGGGCTCTTTCTGTCACAGGAAAGGGTI 1 1 1 T T 
•"TATAAAAT^AAACAGAGATGATTCAACACACCCAGGATATAACACATGG 
GCCATGAATCAAGGGCAGCATTGCTCTGGTCAGCCTGTTGTTTGGGCCCC 
CTTGGCAGGGCTCTCCCCTGAATCTTCCCCTCTTGACTCCCATCANCACA

GCACTCCANCTTTGTGTTACAGGCGATAAATGGGAAAGGGGTAAAT

>Contig48

CATTCTTAATTAGAGAAACGCTCATTAAACTAGACACCCAAATTCTCTGG 
GGGGGGATCATTCTTACAAGCATGCCCTTCTCTCTTAAAGAGAGAGCACT 
TTTTTCGCAAATAATGCTGCCATGAACATACGGGGTGCATGTATCTTCGT 
AATAGAATGATTTCTATTTTGGGGGGTATGTACCCAGCAATAGGATTGCT 
GGGTCAAATGGTATTTCTGGTTCTAGATCTTCGAGATCTTCCACACCGTC 
TTCCACAATGGTTGAACTAATTCACATTCCTACCAACAGTGTGAAAGCAT 
TCCTATTTCTCTGCAACCTCGCCAGCACCTGTTATTTCTTGACTTTTTAA 
TAATCGTCATTCTGACTAGCATGAGAGACAGTATCTCGTTGAGGATTTGA
TGTGCATTTTGCTAATGATCAGTGATGTTGAGCTTTTTTTCATATGTTTT 
TTGGCTGCAAGAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTTGC 
CCACTTTTTAATGGGGGTTTGTTTTTTCTTGTAAATTTGTTTAAGCTCCT

TATAGACTCACAATAACAAAGACATGGGATCAACCTAAATGTCCATCAAT 
GATATAACGGATAAAGAAAATGTGGTACATATATACCATGGAATAGTATG 
CAGCCATAAAAAAGAATGGGATCATATCCTTTGAAAGGACATGGATGAGC 
TGGAAACCATGATCCTCAGCAAACTATGCAAGAACAGAAAACAATTGTTG 
CATGCTCTCACTTATAAGTGGGAGCTGAACACTGAGAACACAGGGACACA 
GAGAGGGGAACAACACACATTTGGGGCCTGTCAGGGGTGAGGTGGGGGAG 
GGAGAGCATTAGGAAAAATAGCTAATGCATGCTGGGCTTAATACCTAGGT 
GATGGGTTGACAGGTGCAGCAAATCACTGTGGCACACATTTACCTATGTA 
ACAAACCTGCACATCCTGCACACGTACCCCAGGACTTCAAAATAAAGAGA 
GACAATACTTCTCCCTTAAGTGTCTACTGTTGCTTTGCAATAAAAAGTTC 
CTGCCTTTCACTTCACTCTGACTTGTCCCTGAATTCTTTCTCGTGATGGT 
GTCAAGAACGTGGACACTGGCTGGGGCTGGAGACTCACCAGCATCCGGAG 
ACCCTCCTGAGCCCTCCAGCAATACAACTTTGACACAAACTATGAAATCA 
CAGATCCAAGAAGCTCAAAGAACCCAAGCACAGGAAACATGATGAAACTA 
CATGAAGGAACATCAGAATTGAATTGTTCAAAATCAGTGATAAAGAGTAA 
ATCTTAAAAGCAACCAGAACAAAATATCCATCATATACGCAGAAATAAAG 
ATAAGTATGACAGCAGATTTACAAATAGAAAAAAAAACAAGTGCAGCAAC 
AGAAACAAACTATCAATCCATAATTCTATACCTAGTGAAAATTTCTTTCA 
AAACAAAGGTGAAATAAAAAAATTATTTTCAGGAATACAAAAGCGAAAAA 
ATTAATCACTAGCATTCATCACTGCAAGAAATGTTAAAGGAAGTCCTTTA 
GGCAGAAAGAAAATGATACAAGGTGAATATTTGGATCCCTGCAAGGAACT 
AAAAAGATCCAGAACTGATAACTTAATGGGTAAACATGTAATTTTCATCA 
ACAAGTGAATGAATAAACAAATCATGATATATCCATATGATAGACTACTA 
CTTAGAATACAAAAGAAGAACTACTTATGCATGTGATAACATGAATGATA 
TTCAAAATTATTATTGAGTGAAAGACACCAGATCAAAACAAAGTACATAC 
TGTATGATTCTGTTTATATAAAACTCTATAAATTGCATGCTCTTCTATAG 
TGACAGAAAGAAGATCAGTGGCTGCCTGCAGACAGGAAGAGATTACAAAC 
GGAAATGAGAATTCCTTAAGAGATGATGGACATGCTCATTACCCATCATA 
TGTATACAGCCATAATGGTTTTACAGATACATATATATGTACACGCCAAC 
ATAAATATAAGTTATCAAATTACAGTAAGTTCTGACTTAATGTCACTAGG 
TTCCTGGAAACTTTGACTTTAAGCAAAATGAXGTACAGTGAAACCAATTT 
TACCATAGGCTAATTGATATAAAGATGAGTTAGGTTTTTGGTTTTTTTTT 
TTTTGACATGAAGTCTCGCTCTATCGCCCAGGCAGGAGAAGAAGAGTTAG 
GTTTTACAGCATGTTTCTGGTCACAAGAACATCATCAAACTTGTAAATAA 
AGGCACAAAACACTTCTAATATTAAATATCAAAATAAATATGAGTTATAC 
AGAATTTAAGAAAGATTAATAAAAACAAGTAAAATCATTATTTATGGGAT 
TTTTGGTAATCAGTGAGTTATGTGGTCATAGTGGAAGTGGGTTAAGTCAA 
GAAATAAATGTTTGCAAAACAAAAATTTTAAAGATCCTCTCCTACCACCA 
CACAAAAAACAAGAAAACACGGTGGGCTCGCTAAGCACTTTTGTACCACT 
CGTATCTTATGCGTTTGTATGATTATTGTAAATGCTTTATGATAATTTTT 
AGAGACAGGGTCTCACTCTGTGTCTCAGGCTGGAGTGAAGTGGTGCAATC 
ATAGCTCACTGCAGTCTCAACCTCCCGGATTCAAGAGATCCTCCCACCTC 
AGCCTCCAGTGTAGCTAGGACTACAGTTGTGTGCCACCATGCCCATCTAT
CTTCTTTTTTATTTTTTGTAGAGACAGGGGTTGTGCTTTGTTGCCCAGGC 
TAGTCTTCAACTCCTGGGCTCAAGCAATCCTCCTGCCTCAGCCTCCGIAA 
ATGCTGGGATTTCGGACATGAGCCAGCAGCACCTTGCCCAGCATTTTATT 
TCATAATAATTATAAGTCATTCCTTCATTCATCTTACAACCCACTTGTTC

CAGTTCAGGATCTCGGGTGACCAGAACCTATTAACGTTCACGCACAAGTC 

AGAAACCAGCCCTGGACAGGACACCATCCTACCGCAGGGAGAACTTACAC 

ACCCACACTCACTCAGACTGGGACCATGCAAAGAACCTAACGTGCACTTT 

GGAATGTGTGTTCCATACCCACTAGAACAGCTAAAATTTAAAAGACTGAC 

CATACTTGAGTGTTGAACAGGATGTGACZACIAACTAAATCTTTTAAGCGCT 

T C3G C GT AAATGGCACAGC CG CTTTGGAAAACAGTT GGCAGTTTTTCAAG 

TTAAATATACCCAAACTCTATGATCCACTTCTCAACAATCAAACAAGAGA 

AATAAAAGCAATGTCTACACAAAGATGTATACACAAATGTTCATTGCAGC 

CTTAATTATACTAGCCCCAAGTTGAAACAAGCCAAATGTCCATTACCAGA 

TGACTGGAACATACAAATTGTGGTATATTGATACAATGAAATACTACTTA 

GTAATAAAAAAGAAAGAGCTATTAACATAAGCAACAACATGGATGAATCT 

GAAAACAATTATGCTAAGTGAAAACAGCCACACAAAAGTTACATACTGTA 

TGATCACATCTACATAAAATTACAGAAAAGGCAAACTAATCTATAGACAG

AAAAGCAGATGAGTGGTTACCTAGGGATGGGGCAGAAGGGACGAAAGGAT 

GGATTGCAAAATAGCACAAAAATATTGGAGGGATGACAAATATATTCATT 

ATCTTGATTGTGGGGATAGTTTAATGGGTATATATAGAGATCAAAGCTCA 

TCTAATTATACACTTTAAATATATGTATTTCATTGTGCATCAGTTATTCA 

TCAACAAGACTATAAAATAATATATGCCTACATACATTTTTAAATATTCA

AAATCTCACAGTTATATACATAAATGCAACTGAATATGTATTCAGATGTT

TTAACAAGCAGAAAGGACTGATTAAACTCATGACAGCGGCTGTTTCTGGG 

AAGGGTGTAGGAGACAAGAGATGGAAAAGAGGATGAGAGCCAGAAGAGAC 

CCTTGTAATGTTTCCTTTCTTTTAGTAAAAATATATTGACAGTTAAAGCT

GAGAGGTGAGAATAATAGTCTCATGGCTTTTGTGTCCTTAAAATTTCACA 

AACTAAGTGAAATGGGAGAAAGCAAAAAAATAAACTTAAATAAATGTTAT 

ATTGCCCAAAAAGAGATTTAAAATGGAGGTTAGACACATGAGACTTACGT

TCTCAAAAAAGTAGAATCTGCAGGGAAGTTTAACAACTATAAAGAATTAA 

AATCTAGCTTCTACCAGCCCAAAGCCTAAAATGTTCTGCTTTATTCTTCC 

TTATTATAATTCATAGGTAATATATTTTATGTTTGCAAATGAATGCAGTG 

ATATTAGATCTCTAAGAGGTGCTAAAAATGAAAAGTACATATTCCAATTT 

TTCCCAATTTTCCTTCTCTTTCCATGAATGAAAAATATACATATTTGATG 

ATTTCCAAGTTTATACAACCGATCTTTCTCTTAGTTTTCTCTTACCAAAT 

TCCCTCCCTCACTCAGCCACCAGCCAGTCCAACTGTGCTACCTGCACAGC 

AGCCCTCATACCATCCACACTCTCATCAGGATCCTGCCTGACCTGCGAGG 

AGCAGCAGCAAGAAGGAGACAGAACCTCCACGCTGAGCATCTCAGGGCTT 

TCTCAGAGACTCCAGAGGACCCTGATAGGGACAGAGCCTGGCCAGCAATC

CATGCTGCCAGCTGTATGATTGTGGGCATGTAAATTCTCAACTGAAAATG 

3GTGTAATAATAACATGTTCTTCCCAGAATGAGCTTTATGAAGATCATAT 

AGCTGTTTGGAACTCAGACAAGCACTGGTAGGAATACAAACAGGGGAGCC 

AACAGCCTATAAATAATACTTTAAGAAAGGGCATGAATGTAATTACTTAG

GAACAAAAGGCAAAGTGGAGAGATGCCTAGGACTGAGCTGGACAAGCTGC 

ACCCTTTAGTGGCTCAGCCCATGGGCTGACAAGGAAAATGGAGGAGCTAC 

CAAAGAAGGTGGAAGGATTCTGGGAGAGTGGCCCTCACCCTGCCCAGGGC 

AGGGCTCAGTGGGAGAGAGGGAGATCTGTTATAAATGCTGCCAGGAGGTC 

GAGTCATGTGAGAATGTCCATGTGAAAACATCCACTGTGTGTATCTAAAG 

AGAGTGGCTGTAAAACAGGTCAGGGTCAAAGGTCTTATTGTCTCAGATGT 

TATCTGCATGCATTGTCTCACGACCAAGAAAACTAAGGAGCATGGACACA 

AAGGGTTAGGTTGAAGCAAAAATTTAATAAGTGAAAGAAGAAGGCTCTCT 

GCAGTGGAGAGGGGAGTCTGAGTGGGTTGCCACTTTGACAGCTGAATCCA 

AAAGCTTTTATAAGAAACTCTTCTCATATCTGCAGCTGTTTGAGTAACTT

CTCTTACCTATAAAACTGTCTGTATAACTCTCCCTTATCTATGCAGCTGT 

GGGATGTCTCCAGGTAAGCATAAAGTGTAGCTTCTCTTGTTTGTATAACT 

GTGGGTTTGTTTTAGGCAAGCCCCCATCCCCTCCCTGTGTAAGCTCCCAT 

GGAGCCCACCATGTGCATATCTGAGAAGTGGAGGAAGCTTTCTCTGGGAG 

CTCACTGATCGTACAAAGAACAAGAGGCTTCTGTGCCGCTTATCTATTCA 

GGTGCAGCCTGAGTTTTCCCCAGGCTGCTCTATTTTTGCCTGTAGCTATG 

ATTTTTCAGGCAGGCTGCTTCTCTGAAGACTAGCCTTAACTGTCTACCTA 

^CAGATTTTTCCTTTTCTTCTCCCTCAGCTGGTTCCCCTCACCAAGGCTG 

AGCAAGTGAAAAGGAGGGCACAGGGCAGGCCAGTAGTGAGCAGCAACAAG 

GAACTAAGACAGCAGAAACCACTCTTCACACCTGGGTTQAAAGGGGTGGG 



FIG. 4 (31 of 61) 



WO 99/06426 



PCT/US98/16102 




TGJGAGC 




.LniouAu^L . t_ LTTAGCC CGTGTTTAGGGT AATTAT GGC 7 " CCAA 




— «-^w-» w «rtwww\vj lull 1 uTATGCC CTGAGTGGC? 

AGCTGTAGATGTGGGTGAGTTGCAGGTGGCTAACTAGAAATGGGGGACTC
CTGTGTGATTGATTAGGAGCATGTTTGGCTTTCTCTGGTTGGTCCTACAT 
TGGAAGAGGGAACAAAAAATTTAGGGCAGTTGTCAGTTATTAATCAAGTG 
TTGGCCATTTTTGACTGACTGTTACAGGAGTGACTGGCTCCCTGGATTGT 
TTGCTAGAAATAGTGGTCTTCACTTCCTGCAAGTCTGACTTTCTGGTAAT 
AGGCTTCCTGGGTTGGCTATTGTGGATAATAAGTGGGTTTCCTGAGCTGA
TTTCTGCAGATTGTGGATCAGAGTTATTTTATATAAACAGTCTGACCATT 
TTCCACTGGCATATTCCATCTTCCAAGAGCTGGCCAAGCTGCTGTCTTAT 
CTGTCTCCCCCAGCCCCTCCACTCTGGCTGTGAAAATACAAGCCACTAGG 
TGAGGAATGGGGACAATTGAAGACTGAAAGCTTTTCTTTGCTGGGTTCGC 
AGAGCTGAGGAAAGAAATGACAACATCCAAGTGTCTGCCCTGGGCCAGTT 

TTAGGACTGTAGTGGTAATGCAAGGACTGTGTGAGTTTATATTTTCATTT 
GTCTCTCTAACTAAGGTGGAA4AA a imirirs » » 




*■ - - - 4 w 1 v-tUftACATTGCAGCCATTAGCC 

ACAGGTGAGTATCAAGCACTTTAAATGAGACTGGTCCAAACTGAGATGTG 

CTCTGAGAATAAAACACACAGCAGATTTCAAAGACCTAGTACATGCCCTG 
ATTrCAAGCTATATTACAAAGCTGTGGTaATraaa irir.-p»Twv«» t-iw 




AAACCCGTGCATGTATAGTCAACTAATCTTTGACAAGAGTACCAAGAATA 
CACAATGGGGAAAGTCTCTTCAATAAGTGGTGTTGGGAAAACTAGATATC 
CACATGCAAAAGAAAGAAATTAGACCCTTGTATTACACAAAATCTAAAAT 
rAATTCAAAATAGAAAAAGACTTACATGTAAGATCTAAAACCATAAAACT 
CCTAGAAGAAAACATAGGGAAAGAGCTCCTTGACACTGGCATTAGCAGTA 
ATTTTTCAGATATAACATCAAAAGTACAGGCAATGAAAGCAAAAACAAGT 
GAGAGTATATCAAACTAAAAAGTTTCTGCACAGCATAAACAATCAACAGA 
GTAAAGACATGACGTATGGAATGAGAGAAAATATTGACATCTGACAAAGG 
GTTAATATCCAAAATATATAAGTAATTCACACAACTCAGTAACAAAAGCC
AAATAACCTGACTTTTTTTTAAAATGGGCAAAGTACCTGAATAGGTATTC 



w* A ^»_i«Ai.uAj.^» J AVi< U uiTA»a«ATauWVACJ^GATATCATCT 

CACCTGTTAGAATGGCTATTATTAAAAAATGAGATAAGTGTTGGCCAGGT 

GTGGAGGAAAGGAAACCCTTGTACATTATTCATAGGAATGTAAATTAGTA 

CAGCCATTATGGAGAACAGTATGGAGATTCCCTAACAAAATTAAAAATAG 

AATTACCATATGACCCAGCAATTCCACTTCAAGGAATACATTCAAATACT 

ATCAGTATCTCAATAAGATACTTGCACTCCTATGTTCGTTGCAGCGTTAT 

TCACCATAGCCAAGATACAGAAACAAGTTAAATGTCCATCAACAGATAAA 
TGGATAAAGAAAATCAGGTAr AT&T&t&t»t» nn x * T*«~m. mm^. „ 



wwu%av.w i.u«u*it i^vaAlAACCTGGATAAACCTGGAGGACATTATGC 
TAAGTAAAATCAAAGCCTGACACAGAAAGACAAATACCACATAATCTCAC 
TTACATATGAAATATGAAAATGTTAATTTTATGGAAACAGAGTAGAATGG 
TAGTTGCCAGAGCCTGAGAGTAGAGAAAATGAGATGCTTGTCAAATCAAA 
TCATCACATTGAATATATATAATCTATTTGTCAATTAAATATTTTAAGAA 
TAAAAAATACCTGGCACCAAAAAAAGAATGCAAAATGTCTCAACAATGTT
ATATGTATTGCATTTTG^^GTGATAATAATTTGAATATTAGGTTAAATaa 

AATATATTTGAAAAATTAACTTCACCTATTT CT 'rrcCATTTTTGTTAACA 

TAGGTACAAAAAAAAATTAAAATTACCTATGTGGCTCATGTAGGTGGCTC 

ACATTATACTTTGATGACACTATACAGGCTGGTGACCATATATCTCTTAG 

ACTAGTCTAAGTGATTTAACAGTGGTTCCAGAAAGATCCAGGTTTAACAC 

CAATGAAAGGGCCAGCTGGCTTAGCCCAGCTTGTGTGGGAAATGTTGGGG 

AGTGGTTTAAGACAGGGAAAAGCAAAACTTTTGATGCTATTGACTTTTTG

AAAAATCTTTTGTGGCTGAAAAAACCAAAACATTATT

>Contig49

GCTCGAGTGTGTCTCTAAAGCCTTTCCCCCATTGGCTCCACTATACGCAC 

TCTCCTGGTTTCCTCCCCTCTAGCCGCTGTCTTTGGTCTCCTTTCTGATT

TTGCTGCGTCCTCTGTCCCCTGAATGATTGCTTCTCCACTACGGGGTGAT 

TTTGCTCCCCAGGGGACATTTGGCAATATCTGGAGAGGTCTATGGTTGTG 

TTTGAGGGTGTTGCTACTGCCATCTAGTGGGGAGAGGCTAAAGATGCTGT 

TAATGCCCAGGACAGTCCCCATAACACAGAATTATTCAGCTCAAAATATC 

CATGGTGCCAAGATCAAGAAACCCTGCTCAAATATTAGCATGXGCTGAAG 

GCCCTTCTCTTTCCTTTAGCAATATCTGCCTCCTTAGGGATCTTTTCTAG 

TCTCAGTGGTTTAACATTTAAAATCCCAAATTAGGCAATAAATTGGGCCC 

CAAACTTCGTTAGTATAAAATGTAGAACTGTGTTATTAGAAGGCTAATAA 

AATGACCTGGTGAGCATCTGCAGCTAGCCTCTGAGCAATTCTGGGGACCA 

CGTGCAAGATAAATCCATCTGTTCCCTCTCTGTAATGTGGCGCTACCTTG 

TGGCCGATTTTTCCTCGGGTTAAATATCTCTGGGGATGCAACTTGTCGTG 

GTTAATGGCTGTGTGAGGCCAGCGCGTGGTGATAAAGGAATCAATCAAGA 

CAATATTGAATTTAGAAAGGCAGATTTATTTAGAGAAAAGGAGAGATACG 

TTGCAAGGGAGCAATGGGCAATACAGCAGAGGGAAGGCTGTCTGCAAAGA 

GGCAAGGGCTACGTATGACGTAGGGCTGCTTAGGCTGAATGCTTGCAGAC 

AAGATGCTTGCGTGCAGGTGGGCTGTGAGCTGAGTGCTTGGGTGCTAGTG 

AGCCATTGGCAGCTGACCCTATTTCTTGGAACATTCGCTCCCTGCAAGCA

riTTAATGTTAAACCGCCAGGTCAGTTTGAATITrCITTTTTCTI M iT'rrr 

TTTTTTTTTTTTGCCTTTAGTAGGACCTGCCGTTGTGAGACTATCTGAGG 

TAAATTAGACACCCTCCTGGTTTAAGTCACCGCTCCAGTGACTAGGCAGG 

GAGCTCTTCCTTGAAGAGGGTGTGGGCAGTGGGTACTTTGCATGTTGTCC

ACACCAGGCGAGCTGCTGCTTCAGGGCCTTTGCATTTGCTCTTTTCTTTG

CCCAAAATGCACTTCTCTCACTGTTCACATGATTTTTCTCCCTCTTTTCC

TTTTAGTCTTTGCTTAAAXATCACCTTCTAGGGAGGCCTTCCCACACCAC 

CTCTTCAAGATTTGAGGGTATGCACCCCCACCCCTAGCCTTCTTATCCCT 

CTCCACTGCTTTCTTCTCAAAGCACTTGTTACGTTCAAATAAAATAGATT 

AGTTACTTTATAGTTCTAATTTTACTATTTTTTGTTTACTTCATCAATAC 

CCATGTAATCTCTGGAAGGAACGTTTCTTTTTGTAGTGTATTTCTAGCAC

CTAGAACAGTACTTGGCACATGGCAGGTGTTCAAAAGTATTTGTTGATTA 

TTTTCTCAAAGGGCATGGAGTCTTAGAAGTTTGAGAACACAGTTCTAAGC 

ACAGCTGTTTAGAGACTATGGATGATGCTAATGGCTGTATTCCCAGTAGG 

TGGGGCAATTCTCAAATTGACCTGGAATCCTTGAGATCTGGGGACAGTCA 

CCAAGCACTGGGCTCTGTGGGGAGAGATGTGCTGGTTTTTAGAGAGGAGA 

ATAGCATCCTGGGGGACTTGGCCCCAGGGCTTTCCTGTCCCAATCTCTTC 

CCAACTGAGTCCCAGAGGCAGGAGGCCTTGTCTGTAGCTGGTCAGTCCTG 

TAACTGTTTCCCTCCCATCTACACAGATGCAAAGAAGGCTGAGAAAAGCA 

AGCTGTCAGGTGAGCAGGGGCCCTGACTCCTCCCCAGAAGGCACTCAGAA 

CTTCCATAGGGCAACTGGAAAGAAGGTTCTACTTCCTCACCGGCAGCTGT 

TGCTGGGGAAAAAACCAGCCTCAGGCCCTACCCTGTGCTGAGAACCTGAA 

TCCAGTATCAGGTTCTCCAACAAACTTGGATCCAGCTGACCCTCACAAGG 

GGTCAGATGCAACCTTGTAGCATATGGAAAATGGCAGCAAGGTCCTTGTG 

TGGACTATGCCTAGAATCTAAATTAAGACAAGGCCTCAGAGGGGCTAAGT 

GACATCTGTCTCCAAAGTTTCACAGCTAGTGTGTGACTAAATCTTGATTC 

CACCCTCTCAGGTTTTACCATAATCCCAAAAAAGGTTGAAACAAGAAAAG

TTATCTTTGGGCAATTACCTCTTTCTGTTCCTTGCTTTACCTACTAATGT 

TCTAGGCTCACCCTCTGGTCTGCAATCTCACTGAACTGACAGATCCCTCA 

TGGCCTAAAGGGTTTTCACACTGGGTTGACTAGGCTCTCCCATTGCCTGT 

CCTACTGTCTAAGGCACCTCCTGGGTAGGGTGCCCAGCGTCATTCTGATG 

CTGCCTGACTTTCCTTCCAGCTACTTTTGAAACTTGGTATCCATGGCAGA 
. ...•**w^* WU iUAiuvjunv3^uii l ^LKiAAATATTTGAGT

GTAAGAGTGAGTTGCTTTACCTCCAGAATATCCTTCCAATGGCTCTGAAG
CAGGCTGTGGAGTCCTGCTGGCTGATCACAGTTCACAGGTGGCTCCCAAA 
CCTGTGGTCTACATCCATCCTTTGTCAGTGTCACTGCCATTGTCCCACAA 
ATCJT CATTTGGGC CTAGCCC CTGGGATAGTAATCAGTCTTTACATAGATA 
TACATTGTGCTTTACATCCACAGTAATTCTGAGTGGACCTTAAAATAAAT 

GCCTTGiAAGGTGCCCCGTCTTTGAGTGTAGACATGGACTCACAACGAGT 
CCACTCCTGCTGTTCCTCTGCTCTTGCTGAGGCTT^CT^^T^G 
CTGCTTTGCAGAGGCTGGCCAGCTGTGGTGCC7GAGGCACCTGTGTCTTC 
ACAGCACCAACTTGCATGGTGGCCACGGTGTAGTTGGAAAGGGATGCTTA 
GATGGGAGGCCAATGGGAGCTGCTTCAGGAGGCAAATCCAAGTCACAGAG 
ATCGAGTCACCGAGAGCATAGTAAACTCAAAATCCCTTCTTCTGCTTAAT 
AACTGAGATGCTGTCACTGGGTTAACCTCACCAAGCCTTGTT5TGTCTTC 
ACTTAGAGTGATTTCTGTCTTAGAAGGCTCCTCATATCCTTCTGGGGAAG 
GCTTCTAGTGAGTCCACAGATAGCTGGACCAGGCATGTCCAGAAATAATC 
TGATTCTCACATTTGAGTTAGCCAGCGTTCCCAGCTATATCCCCATTTTG 
TGTCTATATAAGTTACCAAAGCCCACAAGGATATTAGGTGGCTCCTTAGT 
TTGCTTTATGATTATGCCTTGTGTGTGTGTGTGTGTGAGTGTGTACGCCT 
ATGAGGATTCCTTCTCTCCCGTTCTTGCTATGGCTTCTCTTCCCCACTGA 
TGGGCTGTAGTTCCCTGTCCTTTTGACTTTGGGCTTAGTCATGTGACTTT 
TTTGCCAAGGGAATGTGGGCAGAAGTAACTGGGAGCCAGTCCCAAGCTAA
GGCCTTGGGAAGCATGGTGAGCCTATGCCAGCTCCCTCAGAACTCCTTCC 
CTTGGCCATGAAGAGAGAATAACCTGGATTGTACCTTCAGCCCATG7CCT 
AGAATACAAACATGGAGAATAATGAACTTGACTCAAAGGCTGAAGGGCAG 
CTGAGCCCACATGAGGTCAATTGAACTGCAGCTACCTACAGACCTGAAAG 
TGAAATAAACATGTATAAGTCTCTGACGTTTGGGGTTTGTTTACATAGCA 
TTATTGTAGCAGAAACTTAAATAATACTGGGGGCTAAATATAGTGGACCA 
GTGACAGCACAGAATGCTAAAATt^GTGATTGTTACTTACATCACAACC 
CTTCATCTCTGTTGATGGACACTAAAATCAAAGTGGCAATTACTCAGAGT 
XGGGAGTCATTGAGTTGCATCATTGTTGTTTAGAATCATTGACAGTTTGA 
GCTCTAAGTGATTACAGAGATGGTTTCCTCAGCTACAGGTAAATAAACAA 
AGGCACAGAGAAGTAAAGTGACTTCTAGAGGGCTTCATTGATATTTAGCA 
3CAGAATCAGAGCTAAACAATGAGTCTCTCATCTCCAGCCTTTCTATTCT 
TGTTTCCTAGGTTGGGATTTTGGGAAATAGTGCAGAGAGATTAGCAGTAG 
TGACATGGAACAATGTGAGCCTCAGCTTCCATCCCTGAGGCTGCCTTCAT 
CTGCCAGGGAAATGTCTCTGTGTGCAGCCTTGCCCTCTGCACACAGTGTG 
TATGGCCACCTGAATAAGTGTCCTTTCATAGCGACTAATGGATTGAAATG 
GGTGCTAGAGCAGTGCTTCTAAAAACTCCATGTATTAATCATCTAGGGGT 
CTTACCAAAAACGCATGCAGATTCTGATTCAGTAGGTCTGGAGTGGGGCT 
TGACATTCTGCACTTGTAACACATGGACCACACTTTGAGTAGCAATGTAT 
TAGATCATTCCAGTGGAAACATGTATGAGTGATGGAATGAACAGATATAA 
TTAATCCAGGTCTGGTAAGTGAGGTACTGATACATATTAAGTTGAAGTGA 
ATTTCACATCAAAAATAATGGTTACACAGTGACTTTTACTGCCCCCAAAT
TCTTIXTCTTTTGAGTGGTTTCAAAGTGAACTGAGCC^GCCAGGTTAAGTC 
CCTGGTTTAGTGTGTGATTAGAAGATTTGATCCAGCTTTCTCCTCCTTCT
AATTCTTTAAATATGCAATGGCCTTCTAGAAACTTGTCTCTCAGGCTCCC 
CATGAGCCACCTGTCTTAATATCTTCCCCCCCAGGACATTTCCTGGGTCA 
AGGAAGGAATCAGGGACTAGGAAAAGTAGAAAGGTTGCCTGACAGTGAGA
AACTTTTTGCACTCCTATTTGTTCAATTCTAAAATGTGGGTATTGTTGGG 
GCTTCTAATTGGAATCTAACCTGAAATTCAGGCATGTCTAGCTATATATG 
ACCAAGAATTAGGATGAGTTCACTAGAAGCCTATTTTCAGGAGAGCGGTC 
AGTTAAATTGAAGTTTATGGGTTTATGGTAATGGGTTGGGGAGTTTACTT
CATTAGCAATAGCAACGTTTTTGAATCAGAGAAGTGATTTTGAACACACT 
GTACATAGTTTTCTCACTTAGATTTATCTCTGGGTCAACCCTTGTTGGAC 
CTATATTAGAATCATTTAGTGAAGAAAAGGTGGGTGTCATTAGGAAAAGA 
GCCATTTATTCAAATGTTCTGTTTGACATTAGGGCACTGGCAAGACTACA 
GAATCAATAGATATTTAAAAACAGCCAGGTGCGGTGGCTCACGCCTGTAA 
TCCCAGCGTGA7TTGGC .rTACTTTGGGAGGCTGAAGCGGGTGGATTG -w

TGAGCTCAGGAATTCAAGACCAGCCTGGTCAACACGGTGAAACCCTATCT 

CTACTAAAATACAAAAAATTAGCCGGGCATGGTGGCAGGCGCCTATAATC 

Z CAGCTACTTGGGAGGCTGAGGCAGGAGAATCGCTTGAACC CAGGAGGCG 

GATG7TGTCATGAGCTGAGATC0CGCCATTGCACTCAAGCCAGGGCAAGA 

ATAACAAGACTCTGTCTCACAACAAACAAGCGAACATACGAAACAAACG7 

AACATCCAAACTAGCAGGTACATGCCGTGCCAGTCATGACCCATGGTCAT

AAGATGTCTACAGCTCAGGAAGCAGCTGCACAATGCCTGCATAGACAAAC 

TCTTATGAAAGCAGAATGTCCTGATGTCTCCATAACACATAACAGTGTAT 

GCTTTTATTATGGTCATACTCTAGCTGTGATGTACCTACGCTCTAATATG 

CCAACGATAGTTTTCTTTAAATCATCAACATAATAAATGTCATGCTGTCA 

GTCCCCCACATGTAGACATAACTTAGCTGGTACATGGATAAGAAACCTAT 

ATTAGATAACCTTAGGCCAGGTGTGGTGGCTCATGCCTGTAATCCCAGCA 

CTTTGGGGAGGCCGAAGCGGGTGGATCACGAGGTCAGGAGATCGAGACCA 

CCCTGGCTAACACAGTGAAACCCCGTCTCTACTAAAAATACAAAAAAAAA 

TTAACCGGGCATGGfTGGCAGGCACCTGTGGTCCCAGCTACTCAGGAAGCT 

GAGGCGGGAGAATGGCGTGAACCCAGGAGGCGGAGGTTGCAGTAAGCCGA 

GATCACACCACTGCACTCCAGCCTGGGGGACAGAGCGCAAGATTTCGTCT 

CCCAACCCAAAAANCNAKNNNAAATTTGCACCCAAATCTGACTAATTCCA 

GAGCCAATTCCAATTTAGAATCGTTATATCTCCCTGGTGAACTGAAGCTT 

TTATCTTTAAGGAGACACACTCTTTATGTCTACCAATGCTTATTGCCTTA 

AAGTCCACTTTGTCAGATACAGCTGCTTTCTTTTAATTAGTTTTTGTGTG 

GTATATCTCTTTCGATCCTTTTTCTTTCAGCCTTCTCCATTCTTACATTT 

TAGATATATTTCTTTTTTCTTTTTTTTTTGAGAGAGAGTCTCACTCTCTC 

GCCCAGGCTGGAGTAGTGCAATGGCGCGATCTTAGCTCACTGCAACCTCC 

ACCTCCTGGGTTCAAGCAATTCTCCTGCCTCAGCCTCCCAAGTAGCTGGG 

ATTACAGGAGCCCACCACCAAGCCCAGCTAATTTGTTGTATTTTTAGAAG 

AGATGAGGTTTCGCCATGTTGGCCAGGCTGGTCTCGAACTCCTGACCTCA 

GGTCATCCACCCACCTCGGCTTTCCCAAAGTGTTGGTATTACAGGCGCGA 

GCCACCATGCCCAGCTGATTTTAGCTGTATCTCAAAAACAGCATGGGTTC 

TGTTTGCTTTCCTTATTCAGCTTTATAATGTAAATCATTTACATCAAACA 

TCTAATACACCATGGACTGTAAAACACAGCCATATTTTATGTATGAATTA 

AAAAAAAAAACACCACCAATTAGTTCCTGAGACACACACCTTAACAATAT 

CTCTGTGATGTGCATAAATCAATCACATCAGTTTCTCTGCACCTCAAAAT

TTCTTTCCTCAATTCTCAGAGATATGGCAATTTCTCTGGTTTTACATTCC 

CAGAAGCAAAGAAAAAGTACACAGCTTCTTCAAGTCATGAGTAGCTTCTT 

TTTTATAGCTCTTGGTGTTTGCAAAAAAGATTGGAATTGCTTCACTAATA

~TAAAT7TTCATTCTGCTGCTCTGTT7CTATGACAAGTCAGAGGGCATCT 

^TTGAAGACATTCTAAACAGCAATTAAACTCAAAACATGTAATGACAAT 

GACACACAAAACTCAACTGATGACCAAATGAAGAGTTCCAGCCAAGTTGA 

CACAAGCTGGCTGACAGAGCTTGTAATAGACACAGCTTGGCATATGCCTC 

GCCATTTCAGAGATGTAAAAATAGGAATAAATGTTTTCCCTTAAATCAAT

GAAATAGAGCATTTGGACTGAAAATCTACGACAGTTATAGTCTTTTCTAT

TCATTATTCTGATTCTGTTTCTTCTCCCCCTTGCTTrCTTTTAGTTTGAA 

^ATTTTCTATCATTTCATTTTrCTrCCTACTAGTTTGAAACTTATGCATT 

^ATTTTCTATTTTTTAGCACTTACCTAAAATTACTCTGTAATCCATGGAT 

CCTTAATTTATTTAAAAAACTAATGTTAATGAGTAGCTTTATTTTCCTCC 

CATCTAATTTAAGGCCCACAGAACACCTTCACTTACCTCAATCCTCTCCC 

AACTTACATGCTTTTAATGTCATATATGTTAATACCGTATACTTTTAAAA

CTTTCTAAAATAGCATTATTTTATAGCATGAGTGTTCATTTACATTTTTG 

CATATATTTAGA AT ! J L"1 L 1 1 I GCTCTTCgm'CTTCTTCTATTTATGACT 

CCCCTCTGGGATCATTTTCCTTCTACTTGAAGTACATAGTTTAGAACTGC 

ACTATTCAATACAGTAGCCACTAGCCATGTGTAGCTATTGAAGTTTAAAC 

TAAGTAAAATTGAGTAATATTAAAAACTCAGTTCCTTCATCTCACTAGCC

ACATTTCAAGTGCTCAGCAGCCACATGTGACTAATGACTACTGTACAGCA 

AACATATAGAACATTTCCATCATGGCAAAGAGCTCTATTGATAGTGTTCA

^CCAGAGTTTCTGTTCCAGGACCAAACTGAGGGTTGGGCTGCT ATTTCT C 

ATGGCCCAATAACAAGATGCAGATGAGCTGGGGAGGAAGAGAGTTTTTAT 

TTCTGCAACCAGTTACAGGGAGAAGGCCTGGAAATCATCACCAGGCCAAC 

TCAAAATTATGACGTTTTCCAGAGCTTATATACCTTCTAAGCTATATGTC
TACGTGTAAGTGTGCATlvJACCTGAAGACGTAAGTGATTAACTTCrmH
A7C7G7AAC7AAGG7C7GAGTCCGGAAGA7CT7CCCC7GGAGCC7CAGTA 




TAACCTGTTTGTTGAAGGCCTGGGAATTTCCTCCAAACCCCCATTAAACC 

TGTTTAATCCCAAATTGGTTCCGTTAAAAATTCCCTCCTTAATTTGTCCA 

A7I77AAAGGCCCAAAAAAGGC7GGGGCAAAC7CCTGAA7GGCC777G77 

ACA77 C CAAC C777G777AAAAACACCGG77T77AA7AT77AAC77AACC 

ATTTAArCTCTACTGAAACACTTGTTATATAAATCTGCATTAATGAGAAC 

TGGCCTGCGCCATATCTCCTTCTCAGAATATCTTAGGGTTGTGATCCCCT 

GTGTGAAGAGAATATATCTCTGGAGATCTCAATCTCTCTACCCCAAAAAA 

AATCTCACTCGGAGAAAACTCAGACTCTTATCTCCACAGCGCTATCTCTC 
TCCTCTCC 

>Contig50 

GCTTGTCTAAGATGGTGCTCCTTGTTGCTGTGCCTGCTTTCATCCTGGGA 
rCTCCCTTCACCATCAGGATTGCCTTCACCTCATTCCAGTCTTGGATCTT 
TCTTCTTGTTTCTTGAGTATTT 111111 1 T'l TTT7GC7GCA77CCC7TCA 

GTGGCCTCTTGGGAAAAGATGTGTAGGGAGAAAAATTTTCTTTAGAAACT 
TGCATATCTGACAATATATTTATCCTATCCTGACATTTGGTAGATAGTTC 
AGCTGGGTACAGAATTCTAATTAATTTTCCTTCCTGATTTATAAGACATT 
GCTCCATTTTCTTCTGGCTTCCAATATTGCTGCTGAGAAGTCTGACACCA
TTCAAATGCCTGATTTTTTCCATGTGATTGTTGTTTTCTGTCTGGAGTGT 
TGTAGGATTGCCTCTTTATCTACAGTGTTCTGAAATTTCATGACGTAGGT 
CTTTCTTCATTCATTATGGTAGACACTCAGTGGGCCATTTAATCGGGAAA 
AACATGTGTTCTTCAAGTTCTACAAACTTTATTACTTCCTTTTTCTTGTG 
TCT7TCTCTGGTCTGTTTTCAGCCCCGAGTCTCTTAGATCTGTCCTCTAA 
TATTCCTATTGACTTTACTTCATTTTCTAAGTCTTTATCCTTTTGCTTTA 
CTTTCCGAGAGACCTGCTTAACCTTATCTCCCAACTCTTTTATTGAATTT 
CATTTCTTTTACTATATATTTTTTACTTTGAATACACCTCTCTCTTCCTC 
ACATTTTCCCCCATAGTATTTTGTCTTCAATTGACAGTTCTACTATCTTA 
TTACTCTGGAGATATTAATAATAGTTTTTAAATTTTTATTTATTTTTATT 
TTCAAAACZAGTGTCTTACTCTGTCACTCAGGCTGGAGTGCAGTGGTGTGA 
TCATGGATCACTGCAGCCTTGATCTCTGAGCTCAAGCTATCCTCCTGCTT 
CAGCCTCCCAAGTAGCTGGAACCACAGGCATGTGTCACCATACCCAGCTA 
ATTTTTTTG7TTTTGAGGTGGAGTCTCACTCTGTAGCCCGGTCTGGAGTG 
CAGTGGTGCAATCTGGGCTCACAGCAACCTCTGCCTCCTGGGTCCTGGTT 
CAAGCAATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGATTACAGAAACA 
CACTACCATGCCCAGCTAATTTTTGTATTTTTGTAGAGACAGGTTTCACC 
ATG7TGGGCAGCCTGGGTCTGAACTCCTGACTTGTGATCTGCCCACTTGG 
GCTCCCCAAAGTGTTGGGATTACAGGCGTGAGCCACTGCACCCGGCCACT 
AATT7TTAAATTGTTAATAAAGACGAGGTCTTGCTATGTTGCCCAGTATG 
G7C77GAAC7CG7GGGCTTAAGTAATCTTCTGCCTCAGCCTCCCAAAGTG 
77GGGA77ACAGGTG 7^GCCA CTGJUVTCTGACATTTTTTAAAAGTTTTC 
77C7C777ACCAAGTCTl^rriTCCCCTTTCTG C 7T lTlT GGG7TGTTTTA 
7TT7GA7C7CTATCTTGCTAGAAACTTTCTGCAGACGTTTAGTAATACTA 
GATT777GAGAG7GGGCAACTGGAAAGCTGATTGGAAACTCTGAATACAT 
GGG7GAGGC7TGTTGGCTGTGAGTGTCATTGCTTGATGTCCTGGCAAGGC 
CAA7GGG777GGGACCCCTACTATTAGTATAGGCCTGATTCCCTGGGAAA 
GGC7CT777GATCTCCTGCCTGGAGGATAAAGGCCTGGCTACCAGCCTTC 
TG7G7G7AATGTGAGGGAGAAGGGC7GGAG7A77CAACATCA7GC7GAAT 
CC777CAA7GATCATC7TGTTTTTAGTAATCTCC7ACC7TAAC7CTCTG7 
C77C7GC7AG7A7G GGAAA GA7GACC7GAAAA7C7AACCATT7AT7TTTC 
CCCCA77AA7A7CATT7TATGATTATTCAGAAGT7AAATAATTGTCATGC 
7G7CC7CCAAAAAGACTGAATCAACTAGCAACAAATAAGAATTTTCTCAC 
AGC7C7GCCAGCATT7TAAAAGAATAGC7TTATTGAGCCCAGGAGGTCAA 
GGC7GCAG7GAGC7G7GA7TACACCAC7CTACCCCAGCC7GGGTGACAGA 
G CAAAAC C C7 G 7C7 CAAAAAAGAAA777 AAGGAACAG C777A77G77G7 A 
AAA7AGACA7ACAATAAACAGAGCACATATTTAAATTGTGCAACTTATAC 
777GA7A7AACCCTGTGAAAACATCACCACAA7CAAGATAG7GAA7ATAT 
77A7CACC7CC7GATACAG7T7AGC7C7G7G7CCCCACC7AAG7C7CA7G 
77GAA77G7AA7CCCCAA7GC7GGGGGAGGGGC777G7GGGAGG7GA7TG

AATTGTGGGGGTGCACTTCCCCCTTGCTGTTCTTGAGATAGTGAATGAGC

TCTCATGAGCTCCCCTTCACTCACTCTCTTTCCTGCTGCCATGTGAGGAT 

GTGCTTGCCTCTTCTTTGCCCTTCTGCCATGATGTGTTTCCTGAGTCCTC

CCTAACCATGCCTCCTGTACAGCTTGCAGAACTGTGAGTCAGTTAAATCT

C7777 C T TCATAAATTAC C CAGTCT CAGGTGGCTCTTTATAGCAGTGTGA 

AAAGGAAC7AA7A7ACC7CC7AAG77ACC7CAAGC77C77C77AA77CC7 

TCTCCTCCCTTCCTTCATTGCCAAGCAAACAACCACCTGTTTTCTGTCAC 

TATAGATTAGTT7ACATTTTGTGGGTTTTTTTTTTTTTTGAGACAAGGTC 

7CAC7C7G77GCCCAGGA7GGAG7GCAG7GG7GCGA7CA7AGC7CA77GC 

AGC C TTGAAC7 C CTAG7TTCAAGTGGTC CT CCCACTT CAGC CT CCTGAGT 

ACCTGGGAC7ACAGGGG7ACACCACCACAAC7GGC77AAAAAA777777A 

AA7AAAAA7GGGG7C77G77ATG777C7CAGGC7GG7C7CGAAC7CC7CG 

CC7CAAGCAGCCC7CCC7CCT7GGCC7CCCAAAT7G77GGGA77ACAGGC 

A7GAG7CA7GAC7CC7GGCC7AG7T7ACA777TC7AGAG7777G7A7AAA 

7GGAAACA7ACAGAA7G7A7T77T77GCGGAG7GGGGGAG7GT77C7A77 

7C7TTC777C7' I T777CTr77TrTrrrrrT777777GAGACGGAG7C7CG 

C7C7G7C7G77GCCCAGGC7GGAG7GCAG7GG7GCGA7C7CGGC7CACCG 

CAAGC7CCACC7CCCGGG77CAAGCAA77C7CC7GCC7CAGCC7CC7GAG 

77777 GG7AGAGACGGGG777 CAC CA7G77AGC CAGGA7GG7C7CGA7C7 

CC7GAC C7 CG7GA7C7GCC CG C77CGGCC7C C C7AAG7G C7GGGA77ACA 

GGCG7GAGCCACCG7GCCCGGCCCAAG7G777C7A777C77AACCAGC77 

7CA7GCAA7C7777777A7777ACCA7C7C7G7GA7CCCAC7CCCAAAGG 

7AC7AGA7G7CGA77GG7CC77AGGA7CAGC7ACCA777GCCCAAC7GC7 

77CCAGCC77CCAAAAA7T77777C7777T77C77AAAGA7AC7CC7G7G 

7GAGGC7CAGAAC7C77GAATTGC7AC7GCAAA7A7GAAC7CGG7GA7G7 

GAA7GCCAGGGAA77GCC7GA77GA7CAAAGAAA7G7A7CCCC77C7CCC 

7CAC7C77GC7G7C77C7CAT77G7777CCCCA7CC77G7GGA77CG7GA 

ATT7AAA7A7CCC777AATG77A7AA7A77T7AA7GGCG777GGCGAAAA 

G7ACAGAA77AGG7GCAAGAG7GCA7AGC7G77A7777777777GGCC7C 

7GAGAC7G77CA7A7A7GCAAG77A777AACAGAAAG77C7GCAG7GACC 

7GAGA7G7CAGGGGGG7C7GA7AGAG7ACG777GAAGGCAG77AC7GGAA 

AAAAA7AA7GCCA777C7GG777G7AC7TCGG7AAG77CAGA7GACCCAA 

7A7A77G777ACA7G7GGCA77CAG7AAAAAAG7AGC77CCCC7CCC777 

C77C77CC7777C7CC777CC7GC77C7A7AAAGCA7C7GC777GGGAAA 

C77C77AGGAGGAGAGC77GCCAGCCCG7GGG7AA7GGAGAGG7C77GCA 

GAGA7AAAAGAGA7GC7CCCAC7CAA7GCAGGA7GG7G7GGAGG7AAA7G 

GGGA7ACG7C7GGCA7CAC7CAGGAA7GGGCC77CC7GGCAGGGAAGAGA 

AGGGAGGGGAAAGAGGAAGGGAGTCAAAGA7GAA77GC7GAA7ACGGGGA 

T7 C CAGGG C C7GGAGC CAGGAAGAGAAC777GGGAGG7G7GAACC7GGAG 

GGCA7CAGC7GA7GAGGAGCAGCCTGAAG7CCGGGGAGGACC7G7T7T7G 

G7GGCCAGGAAGAAAG7GCCT7CCACACACAGGGAGGCCACAAGGC7GAT 

GGGC7GGGGG77GGAAGGACAGCCC7AGGACAGGC77GGGAAGCAGGC7C 

AGG7AGGGAC7GCGAGGTTCTTGTTGAGTCTT7T7CA77CCTGG7C7TAG 

AAAA7AGAA7CCAAGGCCTCTTGAGAG7GGAAGG7GGG77GGGAGGAGGG 

CAGA7GGGGC77AGGCCCAGGACACCCGTAGAGC7AC7GCCCAGCTGTC7 

C7CAGGGAC7C7GC7GAGGTCAC7CCAAGGA7CA77C77AGCC77 GCTA G 

ACAG7AC7GACAGAGGGAACCGTAG7A7CGCACCCAC7TCC77C7C777C 

AA7GAAAG777AAAGG7CACCA777C C7CTGGCAAAGGAAG77CCACAAA 

7A77CCA777CCGG7C77AGAAACAGCAAGG7A7CAAGCAA77GCAAAC7 

7CC7G7GC7GGGGAA77CCCAAGGAAG7AGGGGCAGAG77C7GGTGGAGA 

CAAAG7GAA77CCGAG7GATTAG7CAG7AGCAG7AGCAG7AGCAG7AGCA 

G7AGCAG7AGCAG7AGCAG7AGCAG7AGCAG7AGCAG7AGCAG7AGCAGC 

AGCAGAACCAGAA777CCCCGCACG7G7C7CAGGC7C7CA777GCCAAC7 

CAG7C7C7AAG7A77777A77GGCAGGAAAAA7AAAA7AGC7A7GAG7GA 

AA7AA77CA77AGACC7GAGCC7CCA7CAA7TT7G7G777AAAGGCC7GA 

C7C7C777ACC777CCC7GGGA7GGAAGA7GCAAA7G77CC7GATG7CAC 

7 37 CAAAAAAGAAGAAC CAG7GGG7A7A77G7A7G C77GAG77C CAGC CA 

777G7CACAA7AGA7AGAGA7GAC7GCCA7G7G7G7AGAC777C7A7AGA 
CTGTGTGCTAAACCCGACCTGCCACTTCCAAGGAGTAGATGAGGAATG^C
CATGGTTCTGGGGAGCCCTACCCCAATTTGGGGCAGACATTCCAAAGCTC 
ATTTTCTGTGGAGGGGGTTGATGGT7AAAGGACGGCCTGGGAGTAACTCG 
7CTGTACTAGGGCCCAGGAGAGTTACATGCTGCTTCCCATGTTATTCATC 
ATT C C C C CATGTGAATAGCTATGGCGTGAGGTCCAAGGTTAGGGC CTTC 
TACCATAAATGGGGGAATAAAATTCCCCTACCAGCCTGAGAAGTTTCTGT
TAJAAAGAGGCTTTri^TTGCGGGGGTGGGGGAGCAAGCCGACTAATGT 
GTTATTTCCATACGGTTTGTTTTAAAATGTAGATGTCATATGCAGGAGAG 
GTGGTGTAGTGAGTCACAACGGGATTAGAAGGACCAGTCCGAAAAGCAGA 
AGAGGGTCAAGTTCAGGGCACTGAGGACTACTGCATTCAGTGGCGTGAAA 
GGCAGATGGCTGAACAGGAGGGGGACATTACATTGCTTGTTCTCCTTGAG 
CCTCGATTTCCTCATCTAAAAAGAGGGTCATTTATTCACAGAACATTTAT 
TAAACTTGTGCCAGGCACCGTGCCAGGAGCTGGACTAAAAATTAAATCCA 
CCCCTGTGAGCTGCTCTGAAGGCTAAAATATGAAGTATGTAAAAGTAACC 
AAGTGCTGTACACATGCAGCTATTCAATGACTGTGTGGGCATTGCGGCAG 
ATTTTAATTTTCTTTTTTATTTCTTTCTCTTTAGTGAGAGGXGTTGGTTG 
TTATTATTGTCGTCGCTGTAACTGTCTATTTCACTTGCTTT lT ' r GTTGCC 
TCCAGCCCATTCCAGGGCTGTCATCTAAGACACTTCTTATCACCTAAATA 
ACCGGGGAGGCAAAGCGCTTTCTTAAGAGATGGATCCAGAAGAACAATGC
TGGTTTTCTGTAGAAAAAGGGGCTGTGGGAAGTAGAGATAAGAAGGGAAT 
TGGCCAAGATGAATGTACAGAGCCTTATTTTTTTTTTATAACACAGCAAG 
ATTAGATACAAAACAGGACAATAGCATCATCTGTTTTTATAACTGGAAAG 
3ACCTCACTTTACAGGTGGGGAAGAATAGAGTGGAGAAGTGAAGAGAATG 
3TCACAGAGTCAATCAGCATGTCTGCGTCAAAGCTGGGATTCCCAATTCA 
GGGCTCTTACTACAGTGACGTATGGCTAATATTTTGGCATTGTTTCGGGG 
AAAAGCTGAAGCCCTGATGGTGTACGTCACTCTTGAGATAGTCTGTAGTC 
CAGCAGGGAGGAAAGCAAGGAAGGGAGGTGGAGGCAGCATTTTTGGGTGT 
AACATTTCGTTCTTGTTTTGTGGCCAAATCATAGTGTGATTGGGACAAGC 
CACTGCCTTTCTCTGAGCCTCCACTTTCTTTTTCTTCTTAAGAGGGAGGG 
AATAGTAGAGTAAAAGTAGTCATTTTATCAAACACCTGCTATTTTGGAGC 
CATATTGCAAGTGGGTTGGGGGTTGAACACTTGGCTTTATTACCCATAGG 
ATTAAATCCAACCTCGATACTGTGGCATTCCCAAACTCCAGTCTAATCTT 
CTTCTCCATCAGCCATGCCCCACGACACCCTGGTCATATCTGATGTTGCC 
CCTTGCACTTGCCCCCTCCTTATCTTTGCTTTCTGACCTACCATATGGCT 
ATTGGTTGAAATTCTCATTTTCCAGGGCCTTGCTTAAATATCATCTCATC 
CATTAAAACTTTCTTGAACCTCCCCTTGCCCTGTTCCTCCCTAATGTCTC 
AAGCCAGAATTTATTTCCTTTTGTGGCCAAGGGACTGGGTTTGTGACCTC 
TCTCACGAGACTTAATATTGAGACCAAACGTCTTTAGACCTCACCAGCCA
3AGAGATGAGCATCTATGGAATGCAGGCTTTTGCC7GGACTTGCTGATGC 
AGGGCCTCTGCCTTCCTCCAGGGCCTCTCCTGCTGTTTTAGGAATTTCCC 
TCATGGCACAGTCCATGAGCTCAGGGTCAAGTTCATACATGTTTTTACTT 
CTTCTACTCTGCAAATGGTCTTCTTGAACTCTGAGGGTCCTAAAGCTGCT 
CTGCAGTTTGTGGGGTGAGTAGAAAGGGGCTTTCAAAAGTTGTGCTGTTG 
TTTCCCACCCCAATAGCATGAAACACAAAGATGCTTACAAATAGCTGCCT 
TGCTTTCTAGTCCCAACTTCTCTCTCCTGAGGCTTTAAAACAAGTCCCCT 
AGGTTGAGCTGGACTGGAGTTGTATCCTATCTTCATTATCTGTCTACTCT 
CTTTCTGCTCTCTAGAGAAGATATTATATATGTGTGTATGTATGTGTAAA 
TATATAATATCCATATATAGAACATATATTGTTATATTTACATATACATA 
CATAACATATGCATGTATTCATATATACATATGTAGTATCAAAGTTGGAA
TTAAACTGTATATTTTGTAATTTGCTTTTATTTGCATCTATCACTGTAAA 
ATGAATATTTATCCATACCGTAAGATATTCTTCAATGT A TTTTTT lTrr'r 
TTTGAAACAGGGTCTTGCTTTGTTGCCCAGGCTGGAGTGCAATGACCCGA 
TCTTGGGTCACTGCAGCCTTGACCTCCCCGGCTCAAGTGATCTTCCCACC 
TTAGCCCTCTGAGTAGCTGGGACTAAAGGTGTGTGCCTCCACACCCAGCT
TTTTAATTTTTTTTGTATT 1 n'TTTTTAAAGACAGGGTTTTGCCACATTG 
CCCAAGCTGGTCTTGAGCTCCTGGGTCCAAGCAATCCTCCCACTTTGGCC 
TCCCAAAGTGCTAAGATTACAAGCATGAGCCACCACACCTGGCCTCAATG
TAATTTTTAATGGCTGTATAGTATTCCATCATGTGGTTGTACCCAAAATT 
ATTTAAC CAGTCC CCAGTTTATTTCAATTTITTTTTACTATTTTGAATAA 
TGTTTTAGTAAATACCCACAAAATATGTACAATGGCTGGGCTTAGTGGCT 
C ACC Z CTGTAATC CCAA\-ACTTTGGGAGTCTGAGGCAGGTGGGTCACCt'G
AGGTCAGGAGTTCGAGACCATCTTGGTTAACATGGTGAAACCCCGTCTCT 
AC CAAAAATACAAAAATTAGC C GGGTGTGG7GGCACACAC CTGTAATC G C 
AG CTACT7GGGAGGCTGAAGTAGGAAAAT CACTT GAACCTAGGAGGCGGA 
3GTT G C AGT GAGC C GAGATCACACT ACTGT ACT C CAGCATGGGCAACAGT 
GAGACTCCATCTCAAAAAAAAAAAAAAAAAAAAAAAAGTACAATTTGTTG 
"ACCTCCCTGATTATTTCirTTAAGTAGAATTTTCTTATAATTTTTTTTA 
"AAGTAAAATTTTGAATCAAGGGAGAAGCACCTGGAGTCCTTCAGATACC 
^ATTGCCAAACTGAACTTTTCTGTTCCAGGTTTACTACATTCAGCCTGAC 
T CAGGGTTT GGGGAGTAGAGGAGGGGGTGGAGGCAGAGGGCCTCTC CCTG 
"CCCCACAGACCTCCCTTGGTGAGGTCCAAGTCTGGACAGGTGGAGTGTG 
GCATTGCACCGTCAGGTCCTGCTTCCTGTAATTCCCCTAAATCCATCCAG 

TGGAGCCTCATTGTTCAAGTC' lT r n il'l 1 ill 1 1 1 1 1 1 1' 1' 1 rAACTCCC 
CTGAAGACGGAGTCTCACTCTGTCGCCCAGGCTGGAGTGCAGTGGCACGA 
^^TTGACTCATTTCAACCTCTGCCTCCCAGGTTCAAGTAATTCTCCTGCC 
TCAGCCTCCTGAGTAGCTGGCACTACAGGCGTGTACCATCACfiCCCGGCT 
AAT^TTTTTTTGTATTTTTAGTAGAGACGGGGTTTCACCATGTTGGCCAG 
GCTGGTCTCGAACTCCTAACCTTGTGATCTACCCGCCTCTGCCTCCCAAA 
G^GCTGGGCTTACAGGTGTGAGCCACCAGGCCTGGCCTCAAGTCTATTTT 
T^AACTCCAGGAGGCCTGGTATTCAGAGGGATTAGGGCTGGCAGAAGGGC 
C^CAAAGCTTTCAAGGCCTGGGGAATAGGCTGCAGCCTGGTTCAGGGTAA 
CCCAAGTGATTTTGGTTCCAAAGGGACAGGAAAAAAAGTGATTGATATGG 
AAGTTGTCAAAGTGCAACTGTCAAGACATTAAAAAATG7AAC C CTTTTAC 
-AATATACAGTAGACTTGTGTTAAATATTTAACTGATTGTAAAAGGAAAA 

AAC C AG A C GCAGTTTTC CCTAC CAT ACTGT CACAACAC CT CAACACTGAG 
TTCTTCTGTGACCTCTAGTCACCGAAATGCTTGGGGATTTCTCCCACCAC 
T AGT C CT C CAGCAGCCGACACCAGTTGGGTGTC CTAATTCACTCCAACAC 
TATCTACCTGGAGTTAGCGTTAGATCCCACAGGTTGAGGGCTCAGTCTCA 
CAAGAC^G C CT CC CACTT CAGGTGC CAGTT AC AAGTGGTAGGTTGTCACC 
TATGCTTCTGACTGATGGCTATAAATCTGGGTTTGCTTCCCTCGGGTTCC 
GTGAATTTGCTAGAGCAGCTCACAGAACTCAGGAAAACACTTAAGTTTAC 
CAGTTTATTCTAAAAGATATTACAAAGGATACAGATGAACACCAGATGAA 
s**r«r*+. * »^r'iv^r'^riAri&&r^r^TnTr^A.GrTTCCATGCCC 




CCCTTCCAAACCCTGTCCTTTTTGGGTTTTTAAGAGTGGCTTTATTACAT 




-C"AATC r "GCCTTGGTCTrTCCTGTGACAAACCCCATCATGAAGCTACT 
GCATTGGGGCTGCCAGCCAGCAGTCATCTATTAGCATGCAAAAGACACTC 
-TATTATTC CAGAGAT7CCAAGGGTTTTTAAAAGCTGTATGTCAGGAAAC 
AGGAGATGAAGAACAAATATATATTTCACAACATCACACTCGTTGGGGGA 
ATTGACAGGATAGCAAAACTGATTAAAGGAGGATAGGAGAGACTGAGATA 
TATATTTCCATATATATATATAGAGAGAGAGAGAGATATTTCCATATATA 
TATATAGATCTAGAGAGAGAGAGAGATAGAGAGAGAAGAGTCTTTCC 

aSS^gggggagcagttccggaggtacagcccggacaggagatgtga 

GAAGATCGTGGTTANTGTTCCCCTGGTCCAGAACCCCrCCAAGTGGGCTT 

aagtaggaagggtggtgagcggcaggtaaacacacgtcaaaggcagtctt 




aatccggggcctttgaggcagaataaagtggtctcaggttgttggcattt 




1"G ACGT GAATGAtiAA i WWUWHiumw J. ««v. w»v- — — - - - - - - - 

AAGTC^-CAGCTGTTAGAATTCCTGAATATAGAATAAAACrGCCACAATG 
3CAAACAT3CATCTAGTACTTACTGTGTGCTGGGTTCTAAGAATTTTGCA 

^a^tgc^taccgactca(k:ttcacactcaccctcctactgtcccc 
CAGCATAGATAQAAAAAATGGQAATTTA^AAARTar a^«a J



I;I; U ^ T ^ CA; ^ CTGA ^^GTTTACCACCAACAAGCATTCATTA 
AATGGACrrTTAAATGCAGTTTTTAGGAAGAAOSAAA^S^ 

AAG^TCTGAGATGCAAAAAGGAATTATGAACAAAGAAATTGITAAAA^TA 
TAGG7GAATTAAAAAAACTGCCTGCAT1 & & iwiTi>Ts» i^i. ^VvC^ri- 



TTCTAGGGTCCTTCTATCCTTCTAGAGGGGAGTCAATATATTAATTT^G 

ACCGTCACTTACACAGTGAAAAACTTTAAGGATA^CATAAAAAAATAGA 

AATAGAGAGTATAACTTCrGAAAOVGTCAAGGGAAAAATATGGAATAAGA 
AAACTGACCAAAAAACATCTCAGTCAATCiAAiaia^Aa 



ACAA ~ ATAAAC ^ GTAATCAAATAG '^^ 

II A Il " TTTTTGAGACAGGATCTTGTTCTGTTGCTCAGGCTGGAGTGCAG 
:S^I TGATCT ^ GCTCACTG CAGCCTCCACCTCTTGAGGCTCAAGCTTT 
CTTCCTGCCTTAGCCTCATGAGTAGCTGGGTCCACAGGCACACACCACCA 
AGCCCTGCTAATTTTTGTATTTTTTGTAGAGATGGGGTCTCAC^ATGTTA 
CCAGGCTGGTCTCAAACTCCTGGGCTCAAGCGATCTGCCCCCCTCGGCCT 

^^^ZT^^^^^^^^A^^A^ATGGCA^TAATGAAAAATG^TAAA 
TCCATAATCGCAGAGGGATTTCAACTTACTTCTTTCGATTATGTAAAGGT 

TTAATGAACATATATTGGATATGTACCCAAGAATTAGAGAATACATACTA 

gttttgagtttatcxmaa^tttacaaaaattSgtg^ 

ATAAAAAGTTGCTGTC^CGTAGAATAACAC^CAAACCCCTGAGTCCGGAA 
TTCAAAGCCCTCCACACTCTCCTCTACCTTTGCATCTTTATCCTCCACCA 

CATGTTCTGTCAGCTCAAATCATTCTCTCTGCCTGGAATAACTACTTCAT 
ACATATTCTGCTATTGAATTCTTGTCTTAGCACCCCATCTACTCCAAGAC 
GATGTC CAGTTGGGGTTACTCCCTGTCCCATTTTCTTTGATTACACTTTT 
TTTTTCTACTTCCATTATATTATTGATCACATCTGTGCCACAGTTTTTGA 

c^gtgtctgcttttactcttttctagaccctgagX5ctc^ 

TGGGTCATTTCTTTTTTATTTGCTCATTCCTCATGGCACAGTGAGTGC^^ 

aataaatggctattgactgaaattaaactgtatctaaatggacatattcc 



win f 
-~GCTAATGAGGCTAACACTTGGAGAAi i L i iAi ^ 1 ^:"t^^-r^
rTAGG^ ATCAGATATCTGGAGATATCTTCATAAACGATGGCTTTGGTT 
t^AGAAGAGT^TTCTGAAGCTACTATTTC 

l-CCAT^TG^GTAAC3GCCTTrCCTAAAATATC3ACGGTAAAATCTACG 

^ggaaaaaSctattcttc^ 

-?^A^^T^GTGATTAWUTGTAGTTTTCTAAGAAGAT^ 
::TS^^^r^r.TTTr_ACAGTGTTGGAAATCACTCATCCCTCA 



TGTG 
TC 



^m^PVaar^GAGGATAATTCTGAGACTTCTAGGTTGATCCACTGAA 
^^^aTT^TCAACA^CACAAGGAAACTAGGGGAATGAGAAGGC^TACT 

^ir^^TACCTGGAACTCAGTACCCTATGACTGGAGCAAGTCTCTGTC 
TACCAAAATAAATTCCAAATAAATTAAAGGCTTAAACATGAGAAAGTTAA




AATT C AGGG77 CAAAGGTCTTTTCTTTGTCCTTTTCTTTTCCCTTTC CCT 
^^^^w^j^xTTCTTTCTCTTTCTTTCTTTCTTTCTTTCTTTCTTTCT 
^ T X T w£~~TATCTTTCTTCCTTCC^ 

^GGT^GCTTGCTTTCTTTCTTTCCTTCCTTCCTTCTTTCTCTCCCTTTCT 
^TC^C^"^CTTTCTCTTTCTTTCTTTCTTTCTTTCTTTCTTTCTTTCTTT 

^J^ f CTTTCTACTTTCTTTCTTCTCTTTCCTTCCTTCCATCTTTCT 

^TCT^TC^TCTTTCTTTTCTTTCTCTCTTTCTCTCTCTCTCTCTCTCTT 
'CTTT^^TTTCTGGTGAGACAGGGTCTCATTCTGTCACTCAGACTGGAG 
AACAGTCGCATGAACATGGCTCACAGCAGCCTTGACCTCCTGGGTTCAAG 
CAATT^TCCTGCCTCAGTCTCTCAAGTAGCTGAGACCACAGGCACCCACC 
ACCAAACCTGGCTAATTTTTGTATTTTTAGTAGAGATGGGGTTTCACCAC 




AATCCTGGTLriLiLJiAM^ • - — 

GCT^^ACAGTGCACTTAAAAAAAAGTTTGTTTTTTTTTTAAAAGACATT 

TCTACATGTCTAAACTACAATCTTCCTTGAAAACCCAAGAGTAGCTTCTG 

TT3CAATAGCTAGTCAAAAATATAATAGTCAAAAAAATCAGGTAACACAA 

ACAAACGCAAGCAGTTTAAGAGCTGAAATGAACTTGTCTGTTTACACTC 

rAGGGATTCCATAAGGAAAAATAGAAGTrrCTCCCTAAAAGGGAGCCTGG 

-ACC-^C'^CATTTTCTTTAAGGAACCCCAGGCTATTATAAACTATTTTA 

GGGCTCT^TGCAGCAGACGGTGCAAGAGAAAGGAGAGACAGCAGAAGTA 

AATGAAGAAAACAC3AATCCAGTCAACAGAGAAGAAAAA^COTGCTCA 

AAAAAAGGCAAGTTCCTAGGAAAGAAAAAAAAAACATGAGGGCTATTTAA 

A^A^AAAG^CGCATACATACACATGCACACATCTrGGATGTTAGCTTTr 

ACTAAGCTGACTTTTAACTATTGAGGTCCTTTAAAATAA^CTT^AAAA 

TCT^CTACGATATTTCAGCTAGGACAAATTGCTGCTATTTCAGCACTAC 

CAAGTATCAAACCAGAAAAGGCTTGATTTAGGAACCAAACCCAGGCTGTC 

GTGGTAGGAAAAAAGGCAGAACGTTAGCTATGGAACCCACAGCATGGGGC 

AACAGC CATTGCTCTTTCAGTATGGCCTGGCTAGCAAAAAGGTGGCCTTG 

^ATGTAAATAAAGCCCGTTTGGTGGTCAAAATGAAACATCTTTTCCTTT 

^^^^ttttgctggccgTTTTTTCCCCCACCATACCACGTTTGTGT 

GTGTGGGAGGGTGGGAATTTAGCCACTrCAGAGGCCTCATTCCCCATAAT 



rr&CLXAGCCGCAGACAAAATCCCTCAGACACCAAATTAAAGAAGGAAGGG 
TC^CGAGTGTGCAATTCCTGTCCCTTTTAA€GGCTCACAACTCTAAGGC 





TiGGiTOAQL^i^aoii^ rTr^r.A aTrrrTTCGTGGTTAC 

GATTTGT 
CAACATC 
acaggtaggtgagagacaaacaattgcattcttctg^gtctgatSgc
ctttccaaaggaggcaatcagatatgcatttatcacagtgag^aSS 

-^^^ A ^°^^52I^ CTATO ^ GC ^GTACK3ACfGGTGGG 
^I^? A ° A ^S^GGGGGGTGGTGGTCCTGTGCCAATGGTAGCAAC 
^-"^ A ^ CTGAGT ^CTCATTTCCCAACCTCCTCTAGCAGCCCCA 




ATGATTCCAAGAGTCACCAGACCATGGGGGTGGCCAAAGGGAGTGCCTCT 
TCTCACCTCCTACTCTTAATTCCTTGTACTCAAGATAATAAGTTCCCAGA 
AGAGAAGTACCCATATTTAATTCATCTGTGTCTTCCTAGCAGTACTAAAA 
ATATTATATGAAAGGTATCAAACCTTTGAGAATG7GTGCTGCTAAATTGT 
TAAGGATGCTGGAAAACTCAAGACGTCCCTGATCCTGAGCCTGAGTATGA 
G ^" G I° G I GAGCC ^ TG ^GGTCTCCATTCAGACAAAGGCCTCAGGGA 
ACGGAiGAGACCTAGGGACAGAGATGCATGCTGGAGCAGCATTCCCCATC 
C CT AC T G C AG CT CAGGC CAGCTGACTG CTTTATGAGTAAAC GTT AC CAGG 

GAACACTTTGCAGTCTTAACACACATGCCCACCTGTGACCACTGATCCC7 

gttgggtgaccactcsacatca(3a<*ttcg^^ 

CTATCCTCATTAGGAAGGAAAGGAAGGAGGAGGGAGGAGGGCAAACGAAT 
CTTTCCTGCTTGTCAACCACGTCCATCTCTGTTAGGTGATTTCCCATGTG 
I? A HF GmATCmATAATAACTCTGA ^GOTAGGTCTTGATGTCCA 
CATTTTGAACATGAGGACATCCAGCCAGGAAGTTGAGTTCTGGGGACA.TA 
GCTGAGAGGGCAAAGCTACATATAAACCCCTC^TTCTTTTTTCTOTCTTA 
TCCACTGAGTGCCCCCTGCAATCCACCAGCCCATTTGTGAAGTGCATACT 
ATAGGTAAGTTGGCACAGGAGGAGTGGATGTGGGCGATTTTGTCACAGCT 
CTCCAGGAACTTACACACTGGTGAGGAGGGCCAGGTATGTTCCTGACCAG 
TCACAATCAAAGCAACCTCCTACTAATCAGGGAGGCTTGGTACCTGGGGA 
ATGCTATGTTGAAAGGT7CTTTTCTGGGTTTTAAAATGATGGGTCTATTT 
CCTTATTCTTAAGATTGCTTTTTTTCTGGCTAGAACTTAAAAGAAATCTT 




rAGTGACAGGACTGACTGAAGTTCTG^ x waauh^al*tuuuc 

TGGGGCTjACACCGAGATTTCAGAGTCCTGGGTTTCGCCGAGAGCTCAGT 



WW 
GTAGTGCCATGCCCTCTLTCCACCTGAACGCCCAGTGTGGGCAGGAACAA

CTGCAGCTAGAAGTCTGGCACTTACGCTGGGGTCTAAGACC7GCCTGATC 

TGCTAACTAGTCTTGTCCCTTGGCTATAAACTGACGTTGGCACCTGGCCA 

GAAAGATGAGCAAGAGATCTCTGACACACCTTTAAGTCCCTGTGGAGTAG

GATTATGTTGGGGAAGGTCATTCTCTTGACTGAGCAGCAATTTCAGAAGG 

AAGTCCCATGCCGAAGTGAGAGAAGGCAGGGAATCCTGCCTAGTCAGCTA 

GAS CAAAACAGTCTGCAGGACGGGACCCAGGGATGTGATCCTCC CATC CA 

AAGGCACTGAACTAAATGACTAAAATACTTTCCAGGGCTCACGTTCTTTG 

AAGAATGGGGACTAAAACTAAGACAGGAGCCAGCAAGTGAGGACTTGGAA 

GGAGATGGCTCATCTGATCAGCCTCCACTCAACAATTTTAATCATCCACA 

CTGGCATGGGGACACAATATGAATAAGTTGACAGGGACCTACTCTGATTA 

AGCAGTGGGCTAGTGCAGAGACCTGTCAGTCAAGAGTGGACAGGAGATGA 

TTTCAGACAGTGAGAACAAAATTAACAGAGTCATGTGCTAAAGGGTGGCT 

GGAACTACAGAGGAGTTTAAGACTCAAGAGGTCTGGCTGGGCGCGGTGGC 

TCATGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGATCACAA 

GGTGAGGAGATCAAGACCATCCTGGCTAACGCAGTGAAACCG€ATCTCTA 

CTAAAAATACAAAATATTAGCCAGGCGTGGTGGCGGGCACCTGTAGTCCC 

AGCTACTCGGGAGGCTGAGGCAAGAGAATGGCGTGAACCCGGGAGGCAGA 

GCTTGCAGTGAGCCAAGATTGCGCCACTGCCCTCCAGCCTGGGCGACAGA 

GCGAGACTCCGTCTCAAAAAAAAAAAAAAGACTTGAGGGAGTTGTTTATT 

TTTGTTTTCTTTTTAAGACAGGGTCTTTGTTGGGCGCGGTAGCTCACGCC 

TGTAGTCCCAGCACTTTGGAAGGCTGAGGTGGAAAGATCTCTTGAGCCCA 

GGAGTTTGAGGCCACTCTGGGCAACATAGCAAGACACCGTCTCTACAAAA 

AATGTGCAGGTTGAGGCTGCAGTGAGCAGAAAAACACCGCTGCACTCTAG

CCTGGATGACAGAGCGAGACCCTGTCTCGGAAAAAAAAAGAAAAAAGACA 

GGGTCTCGCTGTGTCACACAGGCTGGAATGCAATGGTGCAATCATGGTTC 

ACTACAGCCTGGAACTCCTGAGCTCAAGCAATTCTCCTACCTTGGCCTAC 

CAAAGTTCTAGGACTACAGGTGTGAGCCACCACACGTGGCCTCAGGAGAG 

ATCTTAATAATAAAAGGACAAATTGCCTTGCATCCCTTAGGGGCAGGATT 

GACACATCCAAGGATCAGGCAGAAAGCCTGTGCGGAGTGGGATGAGCAAA 

GAGAAAGGCTGAGAGTTGTGAAGAGGGAGATGCAGTGCCAGCTAGGACAG 

GCCTTTTTGGGCTATGGGAGGTTTTCAGAGGAGACCCCACCTAAACTAAC 

CCATAACATTGCAGTGGGGACCTGTTGAAGTCATGGACTACTACCTGAAA

GCCAGAGAAATGGGAGGAGCCTTTCCTCTGAGGAGGGACTCTAGTCCATA

GGTATCTTGCCACCAAATACATGGACAGGCCCTGGGGGAAGATGGTGGTA

GCCCAGCTGGAGGAAAACCATTTGCCACCTGAACTAGCCCAGGGTAAGCC 

ACCCAGGCACTGAGGGTGCACACCCATGCATGCACACACAGAATCACACT

CCTTCCTATTATTCCTCAATTCAGGGGTCTCAACACCCATTTTTTTTGTT

TTTGGGGTTTTTTTTACATGTTTACATTTTATTTATTTATTATTTTGTGA 

CAGGGTCCCACTCTGTTGCCCAGGCTGGAGCACAGTGCAGTCGTGCAATC 

ATATTAGATTGGTGCAAAAGTAATCACGGTTTTTGTCATTAAAAGTTTTG 

CCATTACTTTTAATGATAAAAACCACGATTACTTTTGACGCAACTTAAAA 

GCTCACTGCAGCCTCAAAATTCCTGGTCTCAGGGAATCCTCCTGCCTCAG 

CTTCCTGAATAGCTGGGACTACAGGCACATGCAATCCTACCTGGCTAATT 

TTTTAAAAA l ' T ' I ' l"! I T TGTAAAGATAGAAACTCATTTTGTTGTCCAGGCT 

GGTTTCAAACTCTTGTCTTTGTGCCTCCCTCTGCCCTGTGCAAGACCTTC 

TGGATGCCCACTAATGAAGACTTCCAGGGAGAGGAAAAGTAAACATAGGT 

CCCTGATCAAGGGACCAGGGTTTATCGACCACAAACAGCATGCCCAGATT

CCACTGGCAGTCCTAGAGGTCGCATTTGCCCCAAGTGTGTGTGGAAGGCC 

TCTCCCTAGCAGTTGGTTTATACACCAGCCACAGCACAGCATATTCTCTT 

AAATTGTGAACATTTGCAAAAACTCCTTGAGGACAACTATCATGTCTTGT 

GTACTTTTGTTTTGTTTCCCTTCCCCTATGTACACGCGCGCGCATGCACT 

CATGCACGCACGCGCGCGCGCACACACACACACACACCCCTCAAACTGAA 

TGCCTGGTGTGCTGAATGGATGAATGGCTAATGTAAGTCATTCTAAAAGC 

TACTTTCTTTGGCATACCATCACCTTTGATTTCATCTTTCTGGAACTCCT 

ATGTTCCCAGATGAATTTGGAAAGCCCTCAGGAAACATTTCAAAATTGCT

ATATGGGAGAAATGGGAGGGTCTCTCTAGAAATTTACCTGCCACAGGTAT 

^TCTGGTAAGACACAGCAAAGGTGGCACCACCCATTCCTCGTTACAATGT 

CAATGCCAGTCACCTTCCTGTCCCATAAAACTTTATTAAAGGTGCAGAAT

7 CC CATGGAAGCAGGTGGACAC CATCTGCTTCCAGCCAGCC AGGGGAGCA 
AGGTGTCCACTGTGCCI'TTGTGGCAGGAACTGCGCTTCTCTACTC m ^CCA
CrTTGAGGCCTCTGGGGCTGGCCTGCTGCCTCCTCATTGACAAGGCTGC* 
TACTGAGCAGTTCATTCTGAGCTGGACATAGTGCTTCTGGTGAGTCTCTA 
CrTCTATTTAACCCAAAGATATTCTTTCCTAAGGAAACGCTTTCCTGTCG 



Ax^L.Ui. 1 i^^iUCiATGATATTCACGGAGCACCACCATGAGCCAGTCATG 
GAGGTGAACAGTATATGCCAGCCCTGAATCAGGTGCATTGACAGCAAGGG 




w — ™ ™ *w wt.^ ^w\v.ua\-l^l 1 1 tjVi<aAAAGGACTTGTTATAGAGG 
GGGTGAAGCATGAGCTGAGTCTTGAAAGACTAGAAATTAGCCAAACTACA 
AGGAGGAGAAGGAGTTTCCAGTCAGGAAGAACAGGTTATGCAAAAGCACA 




TTGCCACACTCAGGAGCTTGAACTTATCCACAAAGGAGGTGTGGAACCAG 
TAATGAATGGGTTTTGTGCAAGGGCTTCATGTCACCAGATTTGCTTTTTG 
GAGATACTTCTGTGGCTGATATGTGAGGAAGGGATGGAGGAAGTTTCCGT 
GGCAATCAGGAAAACCAATTAGCAGATGATTCAAATGGCCTAGGGGAAAA 
GGGAGGAGGACTTGGACTACCATGCAGCAGCAGAAATGGAGAGAAATAAC 
AGATCCCAGGCACTCAGGAAGCGCTCAGAATGAGCCCTTCAAAGAACTTA 
TGGTAGGTGATGGATGGATGGAGTGTGAGTCCTGGGATAGCATTGCCTGG 
GAAAATACTTTCTAGTTGAGACAGGGAAGTGGGC CAGCAGAAATG&AGGG 
CTTCT7 CTTTTTGCTTTAAATACTTTTATAATATTTGGAACTTTGAAAAT 
GAGCAGATATATTAGCAAAAAGCCTAAAAGGGATATTTTTGAAATCACTG 
CTAGTTCTAACATATAACTTTCAGCTTGCACACATCATCAATTAACT T TG 
ATAGCGCCTTTCTGAAACTATCATCCCAAATAGCAATCCTTGTAAAAACC 




TGGAAAATTCTGGAATCTCTATTCATGGTATTAAAGTTTTGCCGTCACAC 

AAAAGTTTAACACCTTTACACAATCAGACTTCCTaVTTTTACATTGCTCG 

GTAATTAGAGGAAATCAGTCAC CCAGAGCCTGGGTCCTAGACTTGACAAA 

ATGCACCCAACAAATCCTGAGTGGCCTTGCTGAGGACTTCTCCCAGAAGA 

TAGAAAACTCAGTTCCAGCCAACAAGGGGGAAGCAGCTGAAGAAGTGAAA 

TTAACAAAGTCCTGGAAGGAAATGACCAAATCATCTTTGATTGTGTAATA 

ACCAGAGAGTAGAATACAGCTACGACAGACATTTTGGGAGAGAAGCATTT 

TATCATAGCTTTTAGAAGAGAATATTTTTCAGCATCATAAGCACACAATT 

CCAAGACAGATACTTTCAAGGGATTGTTTTGACG 

>Contig53

ATGTTNNGG7TT7GGGACCCCATTCAAACT7CATGTT 

ZAATGTTGAGCGAGGTCCTGTGGGAGGGTGATTGGATCATGGGGGTGGGT 

TCTCCCTTGCTGTTCTCAATGATAGTGAGTGAGTTCTCACAAGACCTGGT 

TATTTGAAAGTGTGTAGCACCTCTCCCCTTCATTCTCTCACTCGTCACTG 

CTCCGCCATAGTAAGATGTGTGTGTTTCCCCTTTGCCTTCCGCCATGATT 

GTAAGTTTCCTGAAGCCTCCCAGCTATGCTTCCTGTACAGCCTGTAGAAC 

TGTGAATCAGTTAGACCTCTTTTCTTCATAAATTACCCAGTCTCAGGTCA

TTCTTTATAGCAGTGTGAGAGTGGATGAATATAGTGCCATATGTTTGTAT 




CAGCGAGACCTTGTTTCCAAAAAAAAAAAACCCAAACTGTGTAAAATGTG 
TTCATAAAAGTGTCTTGCTCCCACACCTGTCCCTATATATCTTATTCCTC 
AGCCTCCGACAACTACTTTATTCATTTCTTATGTATCTTCCAGAATCAAA 
AAAAAAAAATCAAATACAAGCACAGTGGAATGTATTGCCCTTCTTCCCCT 
CCCTTTTGTTACATCAGAGTTAGCATATCATAAATACGGTCTGCATTTTC
TTCTTTTTCAGCTATCAGCATGTTTTGGAGAGGATTTCATATTCGTGCAG 
ACAGCATGTATTAGTCAGTCCTTGCATTGCTATAAGGAAATACCTGAGAC 
TGCATAATTTATAAAGAAAAGAGGTTTAATTGGCTCACAGCTTCGCAGGC 
TGTTCCACAGGAAGCATGGCAGCATCTGCTTCTGGGGAGGCCTTAGGAAG 
CTTTTACTCATGCAGAAGACAAAGCGGGAGTGGATGTCTTATATGGCAGG 
AGCAGGACTGAGAGAGAGAGAGAGAGAGAGAAAGGATGCCACATACTTTT 
CACCAGGCCCCACCTCCAACATCGGGGATTACAATTTGACATGAGATTTG

GGCTGGGACACAGAACCAAACAATACCAGAGTGCTTTCTCATTCTTTTCT 

ATAGCTGCCTAGTATTCTATGTCCTTTACTTCATTTAGGCAGTCTCTTGT 

TGATAGACACTTGGGTTACTTCCAATTTTTCCTATTACAAATGATGTGCA 

ATGAATAATTTTGATCATTTTCCATTTCACATGGGTTATGTCCATCTGTG 

GGATAAATC7C CAGGAGTGAAATTGCTGGAT CAAAGGGGAAGTGCACTTG 

7GATTTT CATAGTTAGCAAATTTTGTTCTATAAGGGTCATAT CAATTTAT 

AGTCCCACGCGTAATATTTAACAGTGGGGATTTCCCGACAGTTTGACCAA 

CAAGGTCTGTTGTTAAACTTTTGATTTTTGTCAATCTGATGGGAAAATAC 

TAGTATCTCAAAGTGCTTTTAATTTGACTTTCTTATTACAATGTTAAGCA

TCATTTTACTCTGCCCAAGATCAAATAGTATTTTCTTTTCTGTGAACAGA

CTGTTAAGATCCCTTGCCTCTTGTTTTGCTGGATTTTTGTTCTTTTTTTT 

CAAATGTTTTGAGGCAGTTCTTTACATGTGAAACAAGTTATCTCTTTATC 

TGGGGTGTGAGTTACAACTACTTTTCCTCTGGCTTGTTTTGCGCTTTGAC 

TTTGCTTCTGGTGATTCCCGCAATTCTGAAAGTGTACTTTTTGCATCATT 

CATTCTTATACACCCATGCTCTTGTTCACGCTGGTTCCTCTACCTGAGGG 

CTTTTTCTTTTCTTTTCTATCTGGGAACATTTTTTAGAGACAGGGTCTCA 

CTCTGTCATCCACGCTGGAGTGCAATGGTGCGATCACAGCTCACTGCAGT 

CTTGAACTTCTGGGCTCAAGCAATCCTCCAGTGTCAGCTTCCCAAGTAGC 

TAGGACTACAGGTGCATGCCAGCATGCCTGGCTGATTGTTTTATTTATTT 

ATTTATTTTTTGTAGAGATGGGAGTCTCACTATGTTGCCCAGGCTGGTCT 

TGAACTCCTGGGCTCAAGCGATCTTTCTGCCCCTGCCACCCAAAGTGCTG 

GGATTACAGGCGTAAGCCACCATGCCCAGCCCATGTGTGGAAATCTTCTG 

TTTATCCCTTTAGGCTTGATTCTTATGTCGTTCTCCTCCCTCCTTCCTGG 

CTACTCCTCTTG7TCTTTATCTTACTCTACTTGTCATGTTACCTTGTTTC 

TGCTTATAACTAGCTGCCTCTCCTATCTGAGGAGGGACTTGTGACTGTTC 

TCATCTCTGTACTCCCAGGTCCTAGTACATAGCGCTTGCTCAACAGATGT 

TTGGTGCATTGATAGATAAATCAATGGTAGCTGTTAATACCAGTCCTGAC 

TCCCTGCAGTGCTTCAGCTGATCCTGTTCCAGATGTGCACTGAATATCTT 

TCTGTTGAACAACAGAAATAAAGGGGATGGGTGAGGAGGATAGTCTTCGG 

TGGCCAAGGATATTTGTAGGTACTTTGCAGCACTCAGCAATGAGGAGTGG 

GCTTTAGTCCCCCAAGAACTCTCACAGCCCTGTTTGTCTTTACTGTTCAG 

TGTCAAATCCAAGACAAGTCAATGATCAGGAAAGACCTTTTTTTTTCTTC 

AGTGAAGTTTATTTCAGAACCATTGAACAGTATGATATTTGCTCATTTAT 

AAATATTCCCATTTAAATAATCTGAGCTTATATATTTTCAGTCTTAATTA

AAGGAGTTGATTTAAAGAGAGCACACCAGTCCAAATTGAATTGATTCCAT 

AGCTATTAAAAACTAGGCTCTTTTACAGACACTGCTACTTCTTGCCCCCT 

TTGAATAAATTAGACCAATGAATAAAACAAACAAACAAATAAATAAATAA

ATAGGGAAGCGGTTGCTCATCAGAATGTGGGAGCGAATGACAGAGGGTTT 

CTTAGAACCAAATGTGGCCGTGGTTTCTGTCAGGCGGGCTTTAAGTGAGT

AGGAGAGG7GAGAGAGGCCTGGCTCAACAAAAGGGCTGGGGATTGGCCCT 

GAAAGGAGAGAGCTGACTGTCCTGGCTGATGGACAGGAGATCCTCTTAGC 

ACTACCCTAAGGCAGGCAGTTGGGCATTGGTGTAGACAACAGGAAAGTCC 

AGGCTATAGCCGTACTCAAAAACCTTTCTGTTCCCTTTCTGCCAGCCCTA

GGGATTGAGTCCACATTCAGCACAGGACTCTCTGGGTACAGCTCTCTTTA 

GGAAGACACAAATTGCATGGTGAAGTCAGTTATATCCTGGCCGCCTTTGG 

TCCCTCCCAGGAAGACGGGCATGTTTTCTGCTTGAGAGGTGCTGATGTAC

CAGTTGGGGAACTGGGCAGACTCAAATTCCAGCTTGTTATTGATTTCTAT 

CTTGTTGAAGACAAATCGCTTTTCCATCTTCTTCTTTGGGTAATTTTTGG 

GATCTACACTCTGCAGCGAAAGAGAAAGAAGAATTTTTGTGGGGCAAGGG 

ACAAAAATGCTATGGGAAAGATGTTCTTTGGGTTGGCCAGAAAGGAAACT 

GACGAGCAGGTCACATGATCAGGAGCCACACTCCTGAGTTGTAACTGGGC

CCCCAACTTTCTGTGTGATTATTAAAAGAGCCCTTCTTCTTTTCTAAAAC

TTAGTGCCAAATGCTGAGGAGCATAATGTAGGTGAGAATTTTTTTTTTTT 

GGGGGGGTGAAAATTAAGCTAGAGCTTCTTGAAGTACCTAGTTTCCAGGG 

GCTTTTTATTGTATTTTTCCTTATGGTCCTAGAATGACATCAACTTGGAA 

ATGAAGCTTTTGCTGAGAAAGCTGGAGGTGATAGTGGTGGTGATTTTGGG 

AGTGGAGTGGACGTGATAATGGGACCCTTTAAGTCATCTATTTCCCAAGG 

"GTCTATCAAATGAGAGCAGCCCTAACAATATATAATCTGTTGGGGTTG7 

AACTATGGTAGGACATAATAACATCGGCAAAATGATTTAATTTTCTGCAG 
v. . 1 i laa i u i ^AACKJAAAGGAAAGAGGTGAGGCTGTACT^CTGCCAG
AAATCAGGCACCAGAACTGTTTCAGGAACAGAGAGTAGCCCATGGGAAGA 
AACTGGGAGAGGAGAGGCTGAGCTGGGAAAGTGGCTCCAAAGAGAGACAC 




GCCCGTTACAGCGGAGAGATATAAGTTCCTGCTGGGCGGTAAAATTTCCC 
TACAAGGAACCACCTGGCATTGGGTGGGACGGATGTTGGGGCAAGGGGGG 
AAGACTGGGGAGGGGGATGGACACATTATCGCTCCAGCACTCT7GTTTCA 
GCCTCAACAACAGGAAGAGAGAACCCACAGGCAGTTAGGCCATGTCCATC
AAATGACCCCATATTGTGGAAGAATTGACATTGCACTATGCCCAAGAGAC 
TTGGGTGGACATGGTCCTGGGAGTGCTTGAGCCGTCTAATTTCTCAGGGT 
CACACTCCTGTTAACAAATGCACTGGCCAGTGCAATCAAATGTGCCATTT 
CTAGGACCAAAGTTTGTATATTCCTTTTTAAT A T 1 1 TT ' ITT CACTTGTGT 
TGATCATTTGCCTTAAATTAACTTTCTACTTTGTTTAAAACATGGAGAAT 
TAGCAAGCTGCCAGGAAGCCAGGCAGGGAAACCAGGATGTTTCCATTTAC 
CTTGTTGCTCCATATCCTGTCCCTGGAGGTGGAGAGCTTTCAGTTCATAT 
GGACCAGACATCACCAAGCTTTTTTGCTGTGAGTCCCGGAGCGTGCAGTT
CAGTGATCG7ACAGGTGCATCGTGCACATAAGCCTCGTTATCCCATGTGT 
CGAAGAAGATAGGTTCTGAAATGTGGAGCACATGTTGTTTAGGTATAAAA 
TCAGAAGGGCAGGCCTCGTGAGGCAAGGTGGCAAAATTTGATTTCTTGGA 
GGACACCTGAGCATATACGGTCAAAGTCTGATGACAACACCAGTAGGGAT 
GAAGCTGGGAG7GGGGTGGCTAAGAACACTGGACCTGACACTATTAGACA 
TGGGTTCCAGCTTCAGGTCTATTACTGCTCACTGTGGCCGAGCAACAGAG 
CTACTTAGGTAAAATGGTGATGGTCATAACACTAGCCCACAGGGAGGTTA 
CGAACCTCTGGTGACAATGTAAGTGAAAGGCCCCTGAGAAAGAGTGAGGG 
AGTTGCAAATGTCAGTAGCCATCAAGATCTTCTTTAAGAATAGTTTCCAC 
TAAAGAGATGATTGCTTTGGTTTCCAGCCTTCTTTGTTTTGTCTCCCCGC 
TGGGCCTTCTACCTTTAAAGGGCTTTGGCTCTGGGGGAATTGAGTTGGCT 
GGGGCTTGATGACTTCCAAGAGGACACAAGTGGAGATCTACTGCCTGCTC 
TTGGCTAACTACCTTCTTCAAAGATGAAGGGAAAGAAGGTGCTCAGGTCA 
TTCTCCTGGAAGGTCTGTGGGCAGGGAACCAGCATCTTCCTCAGCTTGTC 
CATGGCCACAACAACTGACGCGGCCTGCCTGAAGCCCTTGCTGTAGTGGT 
GGTCGGAGATTCGTAGCTGGATGCCGCCATCCAGAGGGCAGAGGTCCAGG 
TCCTGGAAGGAGCACTGCGGAGAGAGCGAGGGAGGGAGCCTGGTGAGGTG 
GTCCTGCCAGGAACCATGCTTTGACATCAGAGAGTAGAAAGCTCAGAGAG 
GAGGAAAGGGCTTGAAAGAATCCCGAGCTTCTAAAGATCATCCCTCTCTG 
GGCCAGGCGTGGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAAGCCGA 
GGTGGATGAATCATTTAGGTCAGGACTTCAAAACCAGCCTGGCCAACATG 
GCGAAACCCCTTCTCTACTAAAAATACAAAAATTAGCTGGGTGTGGTGOG 
3TGCACCTGTAATCCTAGCTATTCAGGAGACTGAGGAAGGAGAATCGCTT 
GAACTCAGGAGGTGGAGGATGCAGTAAGCCAAGATTGTACCACTGCACTC 
CAGCCTGGGCAACAGAGTGAGACTCTGTCTCATAAAACAAAACAAAACAA 
AACAAAACAAAATAAAATAAAATAAAATAAAAAGATTATCCCTCTCTGAA 
GCTCAAGGAGGTTAAGGGTGTACTCAAGGGCACACAGCAGGTTAGAGGCA
GACTCAAGACTAGAATGi'GGGCTTTCTCarwCACCTTACAGGCTATTCTTTT

AGAATAAATCCCATTTCTACTTTGTTCATCTTTTTTGTACATGCCCCACC 

TACACCATACATGTATACCTTCTCTATATCTTTTTGTATCCCTAATGCTG 

rCACACTATGATTTGCT7TTTCATGC^GATGACCATAACATTTTCCATTC 

ACCTATGCTCACTCAGCAAGTATTCAATTTTTCTACACTGTTCTTTTTTT 

TCCTTTTTCATAACACTGTCTCATAGGCATTCTGCAAATCCTGTGAGAGT 

ACJTTTTGTGAAATGTTACCACTTTCCTCTTATTCAGAGAAGCTCCGTAT 

TAAGGCTTCACTGAGGTTGCCTTAAGGCATGATAATGGTTCAAAGGCTTG

AAAGACAGTTAAAGAGACCTGTAAGTGCACAAAAGAAAGTTGAGCAGGAG 

AGAATTTCTTGCCTGGAGCAGAGCCAAGCTACTGGAAGAGGCAATGGGGG 

CAAAGGCCAGGCAGACAAGCCAATGGGCTCCTCCCACAGCTGCAGCCAAC 

AAGTTATGCCAGTCTTAAAACTTCTAAAGAAATATGTTTTTAACAAGATT

GAGGACTGGATTATGAGGCTAGGGGAGGCTATCACAAACTGGAATAAAAT 

AAAGCCAGAGAAAAGTGGCTGCCTTCCAACCTGCACAACTGACCTAGCTA 

GGCTGATGGCTGGGCCACCTAGGAAGGCTACTGAGCATCATATAAAACAG 

AAGGGACAGCAGGAATATAACATGGCTCTTTGTAAGGATGAGTCTGAAAA 

ATGACCATTTGCTGCCCAAATGCCCTTAGCTACAACTGAAAATATTTCAG 

AACTGGAGGTTGCAGGATGCTGGAATCTCAGAGATCATCCAGCTCAGCCC 

^TTATTTTTCAGATGAGGTCCAAAGCGGGTAAAATGACTT GTCAA GGTCA 

^CAGCAAGTGAATGGTTTTCTTTCAAGTCTCAATTCATCTTTTTGTTTA 

TATCATCTATGTCTTGTTGTTATAAGCTTCACCCCAGGTAGCAAAAAACT

ATTCTACTCAAAAGGGGTAGACATATGTTAGTTCTCAAGATCATCTCTTG 

GTTTCAGAGTTTAACTCAAGTGATTGGCATAGGCTGAATCCATCTCTTAA 

AAGGATAATCAAATTTATGTTGAAGACTTGGTTGTCTTCCTACTATGAAA 

TGGGAAACATTATCACTACTCCTCCCCTGTCACCACCAAGTGTGGCCACC 

ACCACCAACGTTAGTGAGTGACTGTGGTGATATGATGACCAAGTGGCCAG 

GTCAGCAAGTGG7GCAGCCTGTGTCTCACTGGAAGAGGTTAAAGTCTTTC 

TAAAACAAAATACCATGGCATCAAAGTGGCCCAGAACTCCCTTCTTTGAG

CTTTCCCTGTGTTAGAGCCCTTCCTTGGGTTGGGAGTTAAACCCATAGTC 

TTACCTTCATCTGTTTAGGGCCATCAGCTTCAAAGAACAAGTCATCCTCA 

TTGCCACTGTAATAAAAACAGGGACATGTCTCAATTATGTCTTCTAAACA 

GGTTTATTTTTCCTTCCCTGTGTACAAGACTTGACTGTTCATAAGAAACT 

GCAAACAGCCTGCCTCTCAAAGCTGCCTGAAACACCTGGCAAGTTTCACA 

GTGATATGCGCAGAACAGTCCAGAAGGCAGATTCTAGGCCTGGCAGGTGG 

GCACCCTGGGTGCTCCCTGTTGGATCTTGAGGCCTAACCTCTAGCCCAGC 

AGAGTCAGCTAAAATCTGAGCTCTCCCTCTCCCTCCAAGCCACACTTTGC 

AAAGGGATTCCTTGTATTGTGGGCTTGGAATCTTTTCTCCCCATTTGCCT 

C^GCAGGAAGCCCTTGCAACAACACATCTGGATAGCCTCCAGGTCCCAAG 

GCTGGAGGGACTTGTAATGGGAAAGTAGTCTTTAAATCAGATTTACTTGG 

"ACCCTGTTTGCCACTGAAAGAGGCAATTTAGGGGAAAAATCTGGTCTCC 

AAGCACAGATAACACTCTACTCTTGAAAGAGGAGACCTGCTCATGTTACT 

GGTCTCAGCGTCTCCACTGACCTGTAATAAGCCATCATTTCACTGGCGAG 

CTCAGGTACTTCTGCCATGGCTGCTTCAGACACCTGTGTAAAAAGGAGAA 

AATGAGTGACTTCCCCATGACGGCTACGTTCATGTGTGATTTCTCTCAGC 

ATCCAGTGCATGGCAGTCATGCAAAGAAATGATCTCTGAGTAAATGAATG 

AATGTGTGAAAGAGAAGTCCTTTGGGTCTAGAGAAAAGCATTTGCTAAAC 

CAAACCCCAACTAGCAATGTATTGGCTAGGAGAGCTGGAGCAGAGGCTTT 

GACACTAACCTTTAGGGTGTCAGCTGTTAGATAAGCAGTATCCATTCCCA

GAATATTTCCCGAGTCATAAGCATTATATTACACCTGGCATTTTTGCAAA 

AAGCTGAGAGAGGGAGGCAGAGAGGGAAGGAGAGGGAGAGACAGAGAAAG 

AAAGAGAGAGAGAGAGAGAATATGCATACACACAAAGAGGCAGAGAGACA 

GAGAGACTCCCTTAGCACCTAGTTGTAAGGAAGATTAAAGTCATACTTGA 

GCAATGAAGATTGGCTGAAGAGAATCCCAGAGCAGCCTGTTGTGCCTTGT 

GCC^CGAAGAGGTTTGGTATCTGCCAGTTTCTCCCTCGCTGTTTTTATAG 




CT'HVGGGGAGGGGACTAATAAACTGAGATAATTCTCTGGTTCATGGAAGG 
GCAAGGAGTAGCAAACTATGACACATTTTGCAAATGTATCACCATGCAAA 
- ATGCAT f<-.T— '-CCTGACAATCGTTG7GCAGTTGATGTCCACATTAAAA 
-AC-GGArr^CCCACGTTAGAAGAATGTTTAAATTTAGTATATGTGGGA 
ZAAAGTGGAAGACACACAGATTTATACA * GCACATACTTTTCTTCATTCA
CTTCTTTGTACTTAAGTTTAGGAATCTTCCCACTTACAGATGGATAAATG 
GGTACAATGAAGGGCCAATAGCCCTCCCTGTCTGTATTGAGGGTGTGGGT



CAGCAAAGTGCCTGCTCCAGGGTACTGCCCAGGGGAGCCCTCCATTTGTA
GAATGAATGAGAGTCCAGGTTATGAACAGTGCCTGGAGTGTAGGAACACC 
CTCCTTTGCCTCTTTGACAGGTCTGCATCATAACACTTTTTTTTTTTTTT
TGAGACAGAGTCTCACTCTGTCGCCCAGGCTGGAGTGCAGTGGCACGATC 
TCGGCCCCCTGCAAGTTCCGCCTCCCGGGTTCACACCATTCTCCTGCCTC 
AGCCTCCCCAGCAGCTGGGACTACAGGCACCTGCCGCCACGGCCGGCTAA 
TTTTTTGTATTTTTAGTAGAGACAGGGTTTCACCATGTTAGCCAGGATGG 
TCTCGATCTCCTGACCTTGTGATCTGCCCGCCTCGGCCTCCCAAAGTGTT 
GGGATTACAGGCGTGAGCCACCGTGTCCAGCCTGTAACACTTCTTATAGC 
ACTGAGTTGAAACCTTGCTCCTCCTGGTTCCTCCAGGAAACTGAAATCTT 
TTTGAGCCAAGTCTAGCACAGTGCCTGGCATGTACATTCAGGTGGTAGAG 
rTTGCTGCTTGAATGGGTGAATGGGAATTTGACAGCATTTTTATTCAAAT 
TAGTATGTGCCAGGTATCGTGCTCGCTCTGCATTATCCAAGGGAGTGAGC 
CTCTGTGCAAGTATTTGAGACACGAGGGAAATAGGTTCTACTGTGGGAAA 
AAGAGCATTTCATGGACTTGCTCTCCAAGCAGCCTTCTGATTTTTAAT^ 
GGCTCCCAGTATCTTGATATCAGGAGTCAGTCACAAGAACTCCATCTTTA 
GTAAGTTATATTTTCCACAGGAAATCTAAAAGCTGTTCAACATGTTAGTT
TCCTGTGAATTTGATAAGCCATAATCCATTCCTAACACTGAGCCCTCCTG 
AAATTTGGTGTCTGGTCCTGCAGATAGCTAAAAGCCCTGTCTGGGTGGCC 
TAGGGGACTCCTCTGTTTTGCCTCCACAGGATCCACTTTGCAAATTAACC 
ACTGGTTCTCCCGTTGTAGGAACTGCCACCTTCCTCAGAGCCTGTCTTTC 
T TCC^CCTT CCTTCCTTCCTCTTT C , l , ' riTl 'CTTTCTCTCTCTCTTTCTT 
TCTTTTCriTTCTTTCTrr c riTCTTTCTTTCTlTCTTT^ 
TCTTTCTTTCTTCCTTTCTTTCTCTTTCTCT 

TTTCTTTCTCTCTCCCTCCCTCCCTCTCTCTCTTTCTTTCTTTTTCTTT^ 
TTTTCTCTTTTCTTTCTCTCTTTCTTTCTCCCTCCCTCTCTCTCTTTTTC 
7TTGTCTCTCCCTCCCTTCTCTCTCTCTTTCTCTTTCTCTCTCTCTCTCT 
2CTAGACAGGATCTACCTTTATCCCCCAGGCTGGAGTGCAGTGGTACAA7 
CATGCATTCATTGCATGATCACAGCAGCCTCAAACCCTTCCTCAGAGTC7 
TTATGCGGCAACCAGCAGGGTCTGGAGGGTTGGTGGCTCTGTGAACTCTC
CTGACAGAACACAGAGATGTCTTTGGTCTGTTGATGTGATTACAAGCTGA 
ACGAAGGAGGATCAAAGCCAGTGACAGGAAGGGAGATATGCAAGGGACCC 
GAGCATCAGCTCTGAGTTAGTCCATTCTGCTTCTGGGACTTGGGATACAG 
GTCAGAAACCTTGAGCTTCTACTTCTCCATCTTCCAATTGTAGCATCCAG
GACCTCAGAATCTGCCAGCTAAGAGGAGCCCTAATGATTGTCTGGTGGGA 
TATGGTGGGACCACAGAGATGAAGACATGAATAGCTATTTGAATGTGAAC 
AGCAGACGAAGAAATCAAGGCTAGGAGGGTGGAAGTGACTCATCCAATAG 
CACAGTGTGGTTGAAGCAGCACTAGTATCCAGGTTGCATGAGCCCCTGAT 
GCTTTCGCTCGAGGGAAATTTTGGAGCCATGGGGCAATGCCCCCTGACGT 
AACAGTCTCCACAGTTCTGCCATGTCTCATCCTGGCCCTGTAACCTGGAC 
CCAAATCTGCTACCATCCCATCCATCTCAGGAAGTGAAACCTCTTATGTC 
AAATAGGTTGTGCAACGTATGTATCAGATCCTGTCTTCCCAAGGAGACCG 
CTCAGGCCACAGCACTTCCTTCCGATCCCCAATGAGCAGAAAATATCTCG 
CTATAAACATAGTTGGCACTAAGGGAGGGAGTGGAAGAGTGATGATGATG 
TAGATGGTGATGTAGCCCCAAGGAAGTGGAACAAGCAGAGATGGGGAGCT 
•3GAAATGCCAGGATGCTCCAGCTTTTGGGGAATTATTCAGCTCTTGAGTC 
ACTAAAGCCTTTCTCAGCTGCAAGTTCCTCTTTACCCTGTCAGGTCATTC 
TTCCAAGACAGGAGACTGACATTTATTCAAAGCAGCAAGTGCCCTGATAC 
CATCTTGTGTCTAATCATGGGCTTCGCAGCCAGTTATCAAGGTTGATCTC 
ATCTCATTGGTCTTCAATCATTTTGAACAAGAAGACAAGCAAAATAATCA 
TGGGTTAGTTCTTATATTATTGTGTGTACATGGAGTGATGTCTGTTCTTT

GTAGTGAGCT3TTCCTTCCTTGTTCACCCTCTTGCTTAGAACAGAACTAA 

3CAATCTGCCCCCAACATTTTCCCCAATTTCCCATCTCATTCTTGGCACT 

GGCTTCCTAATATTTGTTCTTATGAGTCATTTTCTTGTATCATTTCCATG

AGTCCCTCTGGGATCTTAAAGTATGAAAAATGTTGTGTGTACCCACACCT

GTCTTTGTGGATATTTCTCTCCTTTCCCTTCTGCTTCTGGGATTATTTGG 

GA^GGGCACTATGATTTTTATCATATCGCTTCCACTTCCTTTATGGCAT 

CATCrCCAATGGGCTTCTTCTCCCTCTTGGATCCAGGTTCTCAGATTGGG 

GACATGCAGAGTCCAAGGAACATTCCATTCTCCTCCCTGGTCTAGAACAA 

GGAGGGCTTAGATATATGAGCAGGTGGCTGGGGCTGGCGAGCTATGTAGT 

CTCCAATGGCTTTTCCCTGATGTCGGAGTTGTTATGTCAGTTCTGGGAGA 

CCAATAAGACCTTGTCCTTCCTTTGGATCCATCAGAAAAAGCCCCTGGGT 

GGGTAAGATGGATGGCAGGGCTCTCCTACTCTATGTCTTTTCTCACACCT 

AGTGGGTATAAGAGAGGGGACCACAAACAGAGGGGGCTCTGGTACCACTT 

ATCCAGGGTCTGGAAACATTTTCTGTAAAGGGCCAGATAATAAATGTTTC 

AGGTACAACTACTCAACCTTGCATCATTTCAGAAAAGCAGTCftGATAATA 

CATAAATGAATGGGTGTGGCTGGACTTGTCCTGCGGTCCCCTGTCTTATA 

TCATTGTATTATATCATTTTTTCTTACATACAAATTTAGAAGCAATACTT 

AAAAAAAAAAAGCCGTCCTTTATTGAGCACCTACTAAGTGCCAGGTACCT 

TTTTTTCCCTCATTATCTTATTAACTCTTCATAATAACCTTTAAAGTAGA 

TAATATTGAAC CATTTGAC CTATGCAGAAACTGAGGTTGAGACAAT^AAT 

TATTTAAGACCGCACAAACAGTAAATGCTGGAACTACGACTCAAATATGG 

GTTAACTGAACCAAAACCAGATCTTTATTTCTCACTTTTAATTGTTACAT 

ATGTTTATTGCCTCATCTCCTGTCCACATGGTGCCCATCGGCAGACTCCT 

TTCTCATTCTCAGTGATTGAGTGACATTCTAAACTACATTGGCCTGGCAG 

\TTCAC CT CTGTC C C CTAAATGTTTCCACATTGTCCTTTTAGGATTGAGA 

TCCTCTCTGTTCCCTTGTCTTCCCTCCTTTCTTCTTCTGGCGGTGACGTG 

CTGTGTGAATTTGTTTCTTTCTCCTCTCAGGGTAGTACTGGGACTTTCCA 

AATCAGGGTTTTTAATGATCTCTCTrCNCTTTTCTGAATTTCTTCCTTAT 

TCCCATTCACTTTCTCATCTATAAGTGGCANCTTTGTTGCTGGAAGATAT 

CCCTTGTGCAGGGATTNCTCTTTAANAATTTGTCNNNACC 

>Contig54 

GTGATCGTCAACCTCCCACCCTGTAGGGCCTCAAGCATTGAGGACAATCA 

CTGGCTGCCCATTAACCCAGAAATGTTGCCGAGACAGGAGGCCGTGGCCC 

AAGTTCCTGGAATGGGGTATTATTATGTCAGCACAAAGGCCTTTGCACAA

ATGAAGGCTTTAAAAATGCAGTCCTAGTCAGGTGGAGGAGGGCTTATAGG

ATTCCCAGGAATCTGGATCATTCTCTTGAGAGCTTTCCCTTGTCTCTGTT 

AAAACTCACATCGTACGGCCCAAATAACAACAAAAAATGGATGTAAATTC 

^TGAAATAACTTGTGGATGGGGGAACAAGGCCCACCCCCCAGATCTGCCA 

GAAGCTTCAGGTGAGGGTCCCAAATGCCAAAAAGTCTGGTATCAGAGAGG

ATGGC CAGTGACNTGGGGACACATGC CCTTTGCTGTGTCACTCAAGGAGC 

AGCAGCTTCGGCCCCGCACAGTGACCAGGACCCTGGCTTCCCACGCTGGG 

CAGGAGCTGGTGTCTGATGAAGGGAATGCCTGGCAGCACGTGCTGTCTGT 

^TCCTCGTGTCAGCTTACCTGGCITrGCTGCGAAGAGGCCACTT GCATTT 

CTTTATTITITATATTTTrTTAATTTTCT 

TTTTTATTTATTTATTTATTTTTAATTTTTTm 

CTTTAAGTTTTAGGGTACATGTGCACATTGTGCAGGTTAGTTACATACGC 
ATACATGCGCCATGCTGGTGCGCTGCACCCACTAACTCGTCATCTAGCAT 
TAGGTATATCTCCCAGGTTAATCCCTCCCCCCCTCCCCCCACCCCACAAC 
AGTCCCCAGAATGTGATGTTCCCCTTCCTGTGTCCATGTGATCTCATTGA 
ATTTCTTTAAAGGTGGAATCTCTCAGTGGGGTCTAATCTGTTCAGAAATA 
TCAAAAGAGTATCCTTGGGAATGACTGGAATTCCAGAGTCATCTGGTAAT
CCTCATAAAACAACTCCTGGATGTCTCTCAGCACATCTCCCACCTTGAAC 
GCAGGAGGCTGGTTCAAATGGAGGAGCATCGCTCTACTGCACTTTTTTTT 
TTTT^TGGCCTAAAGTGCAAAAGGGGATACGTTTCATGTAAATAAATCAA 
-TGCAAATCGCTAGTTATGCTGAGCCCTGTCCCGTGCTGTGGACACAAAG 
GAACCAAAGGCITTTCTCCCCGCCCAACACACACATAACACACACACAAA 
ATCATAAAAACATACATACCCCCAACACATAACAACACACAACACACACA 
ZAAAATATATACACACAACACACACCAAACATGCCCACAAACCTGTGTCC 
AAAAATAAATCCTACTGGTGGGTTTGTGGTCTCCCTAACTTCAAAAATGA
AGCC GTGGAC CTTCGCAbTGAGTGTTACAGCT CTTAAAGATGGCATGGAT

CCAAAGAGTGAGCAGTAGCAACGTTTACTGTGAAGAGCAAAAGGACAAAG

CTTCCACAACCCAGAAGGGGACCCCAGCAGGGTTGCTGGTTGGGGTGGCC 

AGCTTTTACTTCCTTTTGGCCCCTCCCATGTTCTGTTTCCATCCTATCAG 

AGTGCCCTTTTTTCAATCCTCCCTGTGATTGGCTACTTTTAGAATCCTGC 

TGATTGGTGCATTTTACAGAGTGCTGATTGGTGCGTTTTACAATCCCCTT 

3TAAGACAGAAAAGTTCCTGATTGGTGTGTTTTACAATCCTCTTGTAAGA 

CAGAAAAGTTCCCCAAGTCCCCACTGGACCCAGGAAGTCCACCTGGCCTC 

ACCTTTCAACTCCATAATGGCATGAAAATACATATGTTGTACAAAACATA

CATACACAAAGTATACATGCATCTCCCCAAATATACACATACCACAGAAA 

CATACACACAGGAACTCAGCTACCTGTCAAAAGTCTGCATGGTGATTGCC 

7 CTGCAGTGAGTAGTTAGAAAAGTGAATTTGTTTTTCAATAAATTGGAGT 

CCTTAAAAATCGTTGTAAGATAGAAAATTTTTAAAAGTATATAAAATAAA 

ATATGTATGTCCTTTGGTCTAGCATTTACACATGTAGGAATTTATCCTAG 

TGGAGTAATCAATGATATATGCAAAGATTTGGACAAGCATATTAAGCACA 

GAATTATGTATGCATATGTGTGTGTATATATATATATATCTGATACATAT 

AATAATGTAAAAGTGAAAATAACTCAGATGTTCAAAATTGAGGATTAGTT 

AGACTATGATCTGTCCATATGTGACATACAAGTTAGCTGCCCCTTATTCT 

CTCGAGCTTCAACCTCCTATAAACAGTGTCCCTTGTATATCAGTATTGGT 

ACAGATAATCGAACTTATTGAGGTTTTACATGGGGCAATAAAGGCAAGAG 

TTTATGAATACTCCATACTACACTAGGTAGCACCCCCTATTAAAGACAAA 

CTCTTCTCTCTCATTTCCCTTCCTTTCCGGAACCACTTGGTTGAATCTCT 

ACAAGTCTCTATTGCAACTGCCTCAACATGGCACCCTCCCTGCATCTCCA 

TCTTCCCTGTCCTGAGAGCAATGGCCTGCTGCCCCCACACTCACATCCTC 

ATTCATTCCAGAAGTGAGCACCACAGAAGTGCCTACAGTTACCCCAACCA

CCTTCTTAGAAGATAAGTTAGTGTTTGTTTTGACTTTTTAAAATTTTTAC

TTCCTCTTTTCCTTCACAATCTCATCCCATCCCAAGAGGTTTATCAAGAA 

GTTCTCTAAAGATATGTGTCTCCTTATGGAATTTAACAGAAATCAGGGAT 

TTGTATTCTAGCCATCAAGGGAATAACATTTTTCCAGGTCTTTAGACAAA 

TAATGGAATACCTTGCAGTAATTAGATACACTATTGTAGAAAAGTATTGA 

TGAAATGGAACGATGTTTGAGATATCATATTGAGTAGAAAAGGCAAGATA 

CATTAAGTAGGAAATGTATCTTACAAAATAATTTGTCAGACACACTCCTA 

TATTTGTATGTTATATAAATGCGTATGTGAAGAAAGGCTAGAGGATGAGA 

CCACAGTCTTCGGTGAAGTTTAAGAGATGAGGCTGCAGCATGCTCAGAAA 

GGCCTGGGTTATAGTTCTTCCAGTAATTAAGGATGTGATCTTGGGTAAAT 

TGTCCATCCTCTCTAAACTGCACCACCTTTTGTCTGTAAAACAGGAAGGA 

TGGTATTTACCCCCAGGGTCATCAAAGGATTTGGTTGGAGAAAAATAAAT

AAATGGGCTGAGCCCAGACCTGGCACAGTGAGAGCACAGTGGTTGACTAT

TGTGCTGGCCTGTTGTTCCTGTGTTATTGACATGCTGCTGGTGGTGGTCC 

AGAAGCTATTACCTTAATTGGTTATGTGGATTTCCCCTCATACTGAGCAG 

CTGTGTGTGGTGTTGTAAAACATAGCCATACACAGTAACTGACAAGGGCA 

AATGTGATGGAAAAATGCAAGGAAGTGCAGATAAATAGCTAATGGGCTGT 

AGAAGGAAGCTAGTCCTTGGAGGGCTTGATCAAGGAAGGTCCTTTTGCAT 

GTCACCTTTGAAGAAGAGGGGACATAGAAGAGGTATAGTGCATCCCGGAG 

TGTACCTGGAAGGGAACATGAAAAGAGGACATTTTTCTCTGGGACATGGG 

GACTCCACTTGCATGAACTCTGGAATTGGGGCAAAGAACCATCATGAGAA 

CAAGGGCTTCCTTGAACCTCCCAGGCTCATTGGCTGATCTAAACCCTGTG 

TCCCCTCTTTCCTTCACTCTCCTCTGTTTTCTATACCTGTATTATTGGAC 

TGGACTGGAAGCCACCTGATCTATCACAAGTACCTTGAAATGTGTTGAAT 

AGGTGTGGCACAGTCCTTAGCAGAGTGGCACTACCCCCACAGGAATTTGT 

TTATACCTTTGGCATGGAAAATAGCAGGAAATGAGTGATCACTGATAACT 

GAGGATGCTATTTATTATTGGCCAAAGGAATACTTGTGTTGTATTTGCAT 

AACCACTCACAAACTGTTGATTACAAATGAGTACCAGACCTAGCTCCTTC 

AAGTAAAGGATCCTGAGAACTGAAGGCAAACAGAGCTCCAGGAGTCCAAG 

ACAGAGCCACAGACCACGAGGATCCCTGGCCCAGGTAGGTGGTCCTCCTG 

CACTGGCTTTCAAGGCCAACAGGATGGATGGGGAAGTAGAGTAGCATCTG 

GCCATCTAGACCCTTGCTTTTTATCCCCACTGGAAGCACATCTGAATTTC 

TAAATATGATCTCTGAGACCTGCCCAGAACACCTTGCTCTCAGCCCCAGT 

AGCAGCCrGCTCTCTCCCAGGAGGGCTTCCACTAACAAGTAGGGCATTGC 

TGGAGGGCCAGGCAGACACTAGCTTAGGAAATCCACCAACCCTGGAAATG 
ZTAGTCCCrrCTCTGAAGGCTCAGAAGACTGACTTTAGAGTCTAGAAAAT

ATTGGT Z CTTGGGAACAGATTTTGAGTGCAAAGAGATGGACTTCAGATGG 

C^AGATGCACTGCTTCTTTAGGGAATTCTGTGAAAGCTCCCT3CATT7AT 

C^TAATACAGGCAGCAGATTTCATGAGTACCCCCGAGGGATGGCCCCAGG 

'CCTCCAGCCTG7GAGCATCCTTCTGTCCTTCAGCAGCACCACAGTATCT 

TTATATGTCTTTGGATACCTACGTTTCTGCCAGACATCTCTTGCTCTGAT 

GXaC^GGCTGCCAAATTCTCTGTCAAGCGCCTCCAATTTTTTGTGTCCTT 

TGATTTACCCCAACATGACAAAGGCAGTTGTGCTTCATGTATTCAGGGAT 

ACTG CC AAACCACAAACAGGTTAAAATCAAATAGCAGATATCCCTGTTCC 

-AAAGACCCATCAGCTCTACCCACCTGCTCCTGCTCACCGTCCTTATTGT 

TGAGTCCTGAAGCCCTTCCTTGTCATTTTTATTTTTTGCATGAACAATTT 

AG~7C C CTTTGTCTCACTCCTAAACCTTTCTCAAAGGATTGGATTTGTAC 

ACAAACT GC CTAT CTCTGCAAT CTTAGAAGTGATATGATT CTGAACAAAT 

CACTTAACTrrTGATTTTTTATTGGTAAGATGGGAATACCAATTTTTGCT 

CCACT^CTGTCCTATGTTGGCCTGGGCTGATGTTGAAAGCTCTCGGTCAA 




GATTGAGC CTCCTGGGACATTTGGTGGCTTAGGTTTTCATGAAAAGAGGT 
TGCAGAGCAACTGCTTTTTGTTAGGCAAAGATTAGGCTACTGCAGAGACT 




TGCCAoAX OA ILiLl UiAL 1 iVJ 1 iiwv.w»iw.ini 4w*^w>.--^ 

GCCAGAGACAATATGTAAACAAGAGCATGGCTGTG7TTCAATAAAACTT7 

ATTTAAAAAAACAGTCAGGGACCGGATTTGGCCAAAGGCCATAGTGTGCC 

AGCCCCAAGACTAGAGCAATGCACTTTTAACTTTTTTATTTTATTTTTGT 

AAAATGCCAAGATCCACAAAAATGCTATTGCACCCCGTGTGTTAGCACTG 

TGACTCAAGGTTTGGGAAATTCTGCTTTGAAGGCGTGATAGACAGGAGAG 

CATGGTCTGGCCCCTTGGTGCCTTTCTGGTTGCAGCGAGCATTTCAAACT 

ACAGAGCAAGGCCAGTGGTCTGTTCAGCACTAGAGACATGCAGCAAGGTG 

TCCTGGGGTGAGAAGATGCCATAACTGGTCCCCTTTCTATCTCCTTAGGT 

CtTGGACTTCATTCCATTTTCTGTTGAGTAATAAACTCAACGTTGAAAAT 

GTCCTTTGTGGGGGAGAACTCAGGAGTGAAAATGGGCTCTGAGGACTGGG 

AAAAAGATGAACCCCAGTGCTGCTTAGAAGGTAAGGTTCTTGTAGAAATC 

TACCTCAGGGCCAAAGTGTAATTCCTAGAGCAGAACTTTGCTAGGTGCTG 

TGCACAGACCCAGTTGTTTCCTGCTGACTTGCACAGTAAGTGAGCTTTCA 




AAGC - GCTTCAGTGTtLAliviv-ACAi v^vjvj irtvj i««ni«inn»>«.— - 
CCTCACCTACAGATTTGGGGCTTTAGCATCCTGTTTGCCAACTGGATGGT 
-3CATA-GCTTCAAAATGCACCTCTTCCCTCCCAACATTCCCAAGTGGAA 

Sagaagc^ccgatgagaaggaactctctaaggctgggctgaacaaatga 

^CCAGGCACAGGGCATCTGAGTATTCCATGAGGAACACATTTGGGTGTTG 

ccc^tgggggacaataggaggaggcttttgacccaaatgattgtctactg 




AGTT CAT CT CTAT TLiATGTi i u 1 UAAVaMWJV»««vjv. ™- - - - - 

attactatcatgctgctctagtatttctcagcacatacacagaagaggga 

ATTAAATGGTCCTTGATACCCCTAAATCCTTGGAAAATCCGAATTGCATA 

t gctaacc^cactgcgtctgactgcagacccggctgtaagccccctggaa 
ccaggcccaagcctccccgccatgaattttgttcacacaagtaaggcctc 

GGGGTGAGGTGATGGGGGTGGCTGAGGTGCGAGGGTGGGGATGGGGGATG 
GAGCCATTGGGTCCTCTTACAGGGTGAGAGAATTGTAGAATGGGGACACC 
— "lGGGTGCTGGATGGGGCTGAAGTCTTTCCTTTGTGGAAGCAAATCCCA 
«... - ---«f«M«^« * */^*^r<A(^^^r(^r^^arvvV5C^flGTGATGCT 




CACCTGCTAAGAGGCAAAGC^caauuahvjao uioi w j. vjwnnvw * * - 
CAGGTGCC^CTTCTGACCATAGCCAAGAGACTGGAGACACAGACCTCCTC 
CCAGCACTGAGGACAAACAGCCATGGGGCCAGTGGGGGTGCAGGGACACC 
CACACCACTAAGGGCTCAGGGCGGCGCCTTCAGAGCCTGAACCTTCCTCT 




TAGGTGACACAGGTAATAAGTGACAGATGTGGAAATTTAAAGGTA^ 

^cgtc~--tctgcctgactcaggcttaaggctcccatcacctcctcttctc 

AGGACAGAGTCAGGAGGCCTCAGCCTGAGCCCCAGCTCTAGTGCAGGTTw 
ATGTGGGAATACTGAGCX rCACTAGTAL^ATGGCAGAGAGGACCAAATGG

GACCAGGTGTGTAAGGGTGCCTGGCACAGTTGGGGGAGGCTGCTGTCGCT

TCTCCACCGCTGCTGCTGCAGTTACCTTTGATGTTTTAGTTTTGTTGTAG 

TTACACCATTGCTGGCTTTGGATCTGCACTGTGTCCACTCCAGGTGGAAC

CACGCACACAAGCCTCTCTGTCGGGCCTGTCCTGACTTCTCCTTGTCAGG

GCTGGGATCTCCTTCAAATCTGGCGGAAGTGGTTCTCCAAGTCTGGTCCT

CAAACGTCAGCAGCATCAGCGCCTAGAAGTGTTAGGAATACACATTCCCA

GGCCCCACCACAGACCTCCTGCCTCAGAAACTCAGGGCGCTGAGGCTCTA 

GGGGCTGCTTTAACAAGCCTTCCAGGTTATCGTGACGCACCTTGAAAGTC 

TGAGAGCTACTGCC CTACAGAAAGTTACTAGTGCCCTAAAGCTGGCGCTG 

GCACTGATGTTACTGCTGCTGTTGGAGTACAACTTCCCTATAGAAAACAA 

CTGCCAGCACCTTAAGACCACTCACACCTTCAGAGTGGCCTTGAGAAAGA 

TTTGGGGTCAAGGATCATGAGCGAGAACACCACTTAAGAGGATAGTGAAC 

TAGTCTGCATGTGAGACGCTGAGATCCTATGTCAGGCTGTGATAGGAGGG 

AAACAGAAACCAAAGGAAAGAACAGCTTTAAGAAGCGCTTAAGAGGTACA 

AAGTAAAATGATGGTGCTAGAAAAGTAGCTTCTTAAAAAGAGCATTTTCC 

AGTCTCACCCTGGACTAACTGAATGAGAATCTCAGGAGTGTGAGGCCCAG 

GTATCCATGGTCTTAAAATGCCACCCACCAGGTGATTCCCAGTGTGCACC 

AGGGGTGAGAGTCACAGCCTTAGGCCATGCCACTCAAAGGGTGTCTTCAG 

AC CAGC AGCAC CCACAGCT CTGGGAGTGCATCAGAAAGACAGAGGCTTGG 

C AC CAC C CACACCTACTGAACCATAGTTTGCAGGTGATTTCTTGCACATT 

AAAGTGTGGGAAATGGAAAAGCTTAGAGTTCAGCTAGCTCGGTGACTCTC 

AGTCAACCTGCACCTGCTCCATGAACTCAGACTGCCTGGGATGGGCCCAG 

AAAAGCTCCTGAGGAGATTCTGATGTAAGGCAGGGCTGATAACCATGGAT 

CTCATCTGACCCCATATCACTGGGGAGT7ACTTAGGATCTTGCCTGGGGC 

CAGTCATCTCTTCCATAGACACTGAGAGTGTCCACGATGCTTGGGGCACT 

ACAGGGTGGGAGGTGGAGGATCACGGGTGAGTCAGATAGGAAGCCTGCTC 

CTGGGGAGCTTACAGTGCTATAGGGCAGCAAGCCAAGGATGCCAATACCT 

GTGTGCAGGTACCACTGACGAGTGCAGAGCGCTGCAGCACCAGAGAGGAA 

GCTACCCTGTGCAGAGGGGGCTGAGGAGGGCTGCAGGGAGATGACAGGAA 

AGCCGGTGTTACAGGAGGAGTCCTCCCCACTCTTTGGGCATGAGGAGACC 

AGGAGGACATTCTACAGTGAGAAACCCAGGCAGAGGCCATGTGCTTATGG 

CATGGGAAAAGAATGACACCTTAGACTTATTCTCTACATTAGAATTGCCT 

ACCACAGATACCCATATTATAGCTTCACATAGTGTGGTGGTTACTGTGTT 

TTCATATTGTCACATTTGCCATTTTCCAGCCACCCACCCATTCTTGACAG 

TCACTGGCCCAGCCTGGGGGCCCCTGTTCTTTATCAAACAAGTGCCTGAG 

C7C777GCAGAGG7GAGGG7CACC7G7CCAATCAGAGGCCAGGAGGGAAC 

GTTCCCTTTTAAGACCCTACTCTAGGCAGGCCTGGCCCAAATGAGTTGCT 

AGGAGCCCACGCCCTAAGAACCCTCTGAGCACTGTTGTGGCTGGTCCTGC 

TGCTAGAAGTTGTTCCTCCAGGGCC^GGTGCAAGATTTGTGGCTTTTCAA 

AGGAGCCACTAAAGCTCCAGCTCAGCCTTGCACGGTGCTGGGCTCCTGGG 

GGCTTCCTGCCTCCAACCCTCCC^CTCTTCCATCACCGCTCCCTTAGCC 

7GGCCAGTGCAGGGATCTGTTCCACTCTAGGCACTGCTGAGGGAATGATG 

CCTCCAG TCAGAGG GTGCAAAAAAGAGAGTTAAGAAAAACAATGATTATA 

AAAAGTCCTTTTTATACGCCAGACATTTTCTTTGCTCAGGCTAAGTGCTA 

CTTATTTGAGTAAGCATTTTAGTTCTCATAACTCCTCTCTCAAGTAGGTG 

CTGCTATTACTTTCATTTCACAGATGAGGACATTGAGGTTTGGAGAGACr 

TAGTAACTTGTCCTCTGTCCTACAGCAGAGCTGGGATTTGAATCTATCTG 

7 C CAAAT CTGGAACC CATTTGCTTGCACAGAAAGCTT AATTGCTTGTCCC 

AGCAAGA7AGAAAGCCTGGGAGTGGAAGAAATATTCAGTGGCTGTGATGT 

C7GAGCCCACAGGCAGGG7GGAGAGCTAGGGC7GGGGCCCT7GGACGTGG 

GGAAGAAAGGGC7GAGTCTTCCATT7TCAATG7GAAGTGT7GATATC7GG 

7GA7A77GA7CTAGGTCCAAAGGTGAAGAACTTAAACCCGAAGAAA7TCA 

GCA77CA7GACCAGGATCACAAAGTACTGG7CCTGGACTCTGGGAATCTC 

A7AGCAG77CCAGATAAAAAC7ACATACGCCCAGGTGAC7CTCAGTTTTG 

GC7G7G7777C7GCC7CCACC7AGCAGGGG7AAGGCCTCC7GCTAGGTGG 

GC7CAAC7CCA7GC7ATACCA7GCCCCATC7CCAGCAGG7GG7GGAAGCG 

AGGAGGAGAGGCCCCAGGGAC7AGGGCA7CAGA7GAAGGGTC7C7AGCAA 

7GACCAGA7C7GAAAG7AGTC777CTGGAAGGGC7GGAGAAAAAGAAGGA 

GGC AGACAC77AGAC7GGAAGAAGAGGAGGCT7AAAC CGG7G7GA7GGAG 
GGAGAAG7GGACCACAGAviTCAAGGGAGAGGGACTGTGCATCAGGCCTGA

AACCCCAGCAGACAGGAGAGACCTTTCCCTGCTCTCAGAACCCACACATG 

77 C7 G A CTGT CTTTTT C CAGAGAT CTTCTTTGCATT AGCCT CATC CTTGA 

GCTCAGCC7CTGCGGAGAAAGGAAGTCCGATTC7CCTGGGGGTCTCTAAA 

GGGGAG777TGTCTCTACTGTGACAAGGATAAAGGACAAAGTCATCCATC 

CC7TCAGCTGAAGGTGAGAGTTCTAGCTCAGTTTCCTGGGCCTTTGGCTA 

CCCCAAAGTAAAAGGCCAAGATCCTCAATGCCTCTCGCTTTCCTGCAAAT 

7CTTATCTTGGCCAATATAACAGGGACATCCACCTTTCTGGAAGCACCAG 

GCAGAAGAGCCCCATAACTTCTTCTCTGGTTCCTTGCCCCTTCTAGGGAA 

GGAGGAGAGAC7CCTCACAGCGGGGAGACAGCAAGGAGCTGAGCACCTGT 

7C7CCTCTCCTGGGCTCACTGGTCCTGGCCCTGGGCGGGTGGCGGTCCCC 

TCCTGCTGTGGCCCTCCATGTGGCAAGCAACACAATTGGGCCAGGACCCT 

GGCGTGC7GCTGTAGGGTAGGAGGGTGTGAGGGAGCACTCGGAGGGCAGT 

G7GTC7GCCCTGCAAATTTAGTCCTGGATGGAGCATCCTTTCACTTGAGG 

GGAGAAA7C7TAGGAAGCTGAATTAGATACAGATCTAAGCCATATTCTCT 

AATT77AAAAACTATAGAGCTGAGATTTTGGTATCCATCTGACTCTTACG 

TCTCTCTCTCTCTCTCTCTCTCTCTCAGTTTATTTTTAATCTGGGGGACA 

AGAAGGCCTGGAAAAGAGGGCATGATTGC7TATCATCCCTTAAATACCAG 

7ACCAAGGCTGACACGTCATCTTTCCCAAGGACCATCTGCCTTCTCTCTT 

77CC7CCTCTCCTGTGTAAAGGCCTGGAGGATGAGCACATGTGCTGTGTT 

7TCCTCCC7CTCAAAGCCTGTGCTATCTAATTAATCCCTTTTACCTCACA 

GAAGGAGAAACTGATGAAGCTGGCTGCCCAAAAGGAATCAGCACGCCGGC 

CCTTCA7C7TTTATAGGGCTCAGGTGGGCTCCTGGAACATGCTGGAGTCG 

GCGGC7CACCCCGGATGGTTCATCTGCACCTCCTGCAATTGTAATGAGCC 

TGTTGGGGTGACAGATAAATTTGAGAACAGGAAACACATTGAATT7TCAT 

TTCAACCAGTTTGCAAAGCTGAAATGAGCCCCAGTGAGGTCAGCGATTAG 

GAAACTGCCCCATTGAACGCCTTCCTCGCTAATTTGAACTAATTGTATAA 

AAACACCAAACCTGCTCACTAAACTTTCTGTCATTGGGTTTCATTTCTCA 

7TCATGCTTTAAGGATTTGTGTTTTTAGGATATAGCAAGAAGCTTGTTTA 

ATTACAAAGTTCTGGGTTGGAAAGAGACCGGCTTCTGCTTGTGTACTGCT 

ACCCTGAACCATCAGACATGCATGTGTGTGTCATATGCTATGATGTGGCC 

AGTCTGAGTGCAATACTTGCAGCGGGAAGGAGCAGCTGGGTGCATGCTGT 

GCTCTAGAATTAGTCTTTCCTACTGGGGTTTGGTAGATTCTGAGGGCATT 

GATCCTGGGGCAGAAGTGGCTGAGTCTGTGTCTAGGGTACAGTGTGCAAG 

AAAGAAATGTAACAGCAAGTCACAATCCAGCCAAGTGATAGTGGAAAAGG 

GGTAGTTAGG7CCCAGATAAGGAGCAGGGTGACTTGACCTGTGGGAAAGG 

CACAGAGACAAGGAATCTGGGTCAGATGACAGCCAGGAGACCAGGTGAGG 

GAGGAGCCAGGTACTGTCTGGGAGGCTTGTCAACAAGGGCATGGTCC7AT 

CAC7AAGCAGGGCTCAGATCCTCATAATGGGGGAGTGGAAGGCTGGCCGA 

ACAGAAA7CAGGGCCTGGAAACAGAGTGAGGGGGTGGAGACAGGAGACTG 

AGGC7TGGAAATTAGTTTATTAGTTTTAGCTCTTCAGTTACAAGCA ATAA 

7AATAGCTTCTAGCTTATTTAAGCAACAAGTATACTACAAAAGGAGCTTT 

C7AGAAGGATAT7GGGTATATTCATT7CTTAC7GCTGC7GTAACAAATTA 

CCACCAACTTAGTGGTTTAAACAATGCAATGTATTATCTTGCAGTTATGG 

AGGTCAGTCTGGAATGTGTCTCACTGGGCCAAAATCAAAGTATCAGCAGG 

ATAGCATTGCTTTGGGAGGCTCTAGGGGAGAGTCAATTTCCTTGCCTTTT 

CCAGCTTCCAGAGGCCACCTGCATTCCTTGGCTAGTGGCCCACTCCCATC 

TTCGC7GCTTGGGTTTTTCTCACACTGCTTTGCTCTGACCCTCCTGCCTT 

CCTC777CACATATAAGAACGCTTGCAATTTACATCGGGCTCACGTCAAT 

ATCCAGGATACTCTCCCGTCTCAAAGAGGCTTAACTTTAATCACAGATGC 

AAAGTCCCTTTTGCTATGTCATGTAACATATACACAGGGTCTGGGGATTA 

GAATGTGGACATTTTCGGGGTGCCATTATTCTGCCTATCATGTGAAGTAA 

CTTTC^AATGGAAAGACATGCTGAAGAAAAAGTCAGGGATTTCTGGCAG 

GCCAGAAA7GACAGAAGGCAGAAAACGTTGGTCCCATCACTCAGATGGGT 

AAGAGCCAATCATGCTTTTTGTCAGTTAGCAAAAGATTGAGATTCCAAGC 

AAAGCATGCAACTGCCCTAGTTTGGGTCATGTGTCGACTCCTTGGTCAG7 

GAAGGGCAGCACACCTTGATCAATACTCCCTCCAAGACTGTATCCAACGA 

GGCCAG7GATGTTCCTCAAAGCAGAGCTAGAGAGCTAATCCCAGGAGAGA 

GGCG7G7GGGTGG7GGGCAGGAAGACAAAGCTCAGCCGTAAAGGAGTAGT 

AGGGACAGCACCC7AGGCATGGAGGCTCAAGTGAGATGATACCCATGGGA 
.^GCTCTGATAAGGTC^viCTCCTTCTGTTTCTGATCCTGATGGTGATGU'

7GA7CAACACCAGCCCAG7GACAAAAAAG7ACA7AG7AXA777AG7AGA7 

GTTTCCCACACAGAGAAATGGTAAATATTCAAGGCGAGGAATACTCCAAA 

GA7CC7ACC77GA7CA7TACACA77CCG7GCA7G7AA7GAG7AC77GCA7 

GTATG C C A7AAA7A7G7GAAAXA77A7G7A7 CACTATATAAAAGAAAAAA 

AAATGTGGCCAGGTGACATCCATATTTTGGAGAGGAAGGCATGTCTTCTT 

CA7AA7A7CACAAAAC7ATTT7CACAACAAAGACACAGC7G77CAAA77A 

GTCTCTGAGCCGGGGCTGTCTCATGGCAGTGAGGACTCTGGTTCCCTTAC 

AGAC7AGCAGAAAGGAGA7GGGGC77AC7GACCA7GGCC7TGAGGAGGC7 

GAACATGCAGGCCAAATGGAGACACAGACAGCCTGGGCTTGGTCCTGCTC 

CATCCCCTTCCAACC7GATGAGATATAGTGAGTCACTATGACGTGGGTCA 

CTCATGCTTCCTGTGAGGCTCCACCAAGACAGCAAGTGCATCAACACCTT 

ACGGAAGCACAAGGCCCTGTTTGTTGTTGACTTCATGAAAGGCATGGTTG 

TGGTGATCGCATTGAGTAGGCTTTTGGGTGAGAGGTGAAAAACCCCAACT 

ATCATGCATTGCAGCCCTCTGGTGGAAACTGTGCTTCAGGCTCTAAATTT 

CAGGCTCTAGACTGACTCCAGGATGAGTATTTGGAAGCTGAAfiTCAATCT 

GTGGTCTCTTCTCCTGTAGAGCAGGAGTCAGCACTTTTCATAGAGTGCCA 

GATTCTATATATCCTGCCACATGCTCTGTTGTTACAGAACAAAGAAGGCC 

ATAGACAGCATGGCTGTGTTGGCAAATACACAAAACAGGCAATAAGCTGT 

ATTTGGCCTTTAGGCTGCAGTTTGCCAACCCCTGCACTAACACAGAGCTT 

AAAGG7GGTGGTGGTGTGCTGGAGCTAGCTTATATCAGCTTGCAATAGCC 

AATTGCTAACATCTCTTCCAAACTCTGTGTCTGTGCCTTGATGTTGATAG 

777GAAA77GGC7ACCCCATT7AA7GC7GCAA7C7777C7CACCCCAGCA 

CTAC7GACTCCCCTTTGCCCTGTCTTATTTTTCTCACTCTAACATGCTGT 

A7 AG7777C 77 C77ACAT77A7TG777G7G7C77C CAC7AGCA7G7A7G7 

CCCACAAG77C777GC7CTGTGA7GTATCCCAAGAACCCAC7GCAG7GC7 

7GGCAC77G7AGGAAC7CCATAAGATTTTTATAAATGAAGAAAGGAAGAA 

AAAAGAGAGGGAGGGAAAAAGGAAAGGAAGCC7TC7ATTTAAATGATGGC 

CTTC7CCA7ATTTCTATAGTAATATGACTTCCCTTGCAAAGGGGGATGCA 

777TGGAAAA7GTGTATAAATAAAC7CAGG7GG7T7TGAAT7TCA7TTTC 

C7AAC7G7AA77G7AA7CATTGG7C777A7G777AG7GAAAAAG7TT7GG 

CCCT7A7GCC7CACACCTGAGAA7CCCAAAGTA7TGG7TTGTTAGAGCTC 

CCATAGAGAACCAXAAACTGGGTGGCTTAAAACAACAGAAATGTATCGTC 

7CC7GG77CAGGAGGCCAAAGTC7GAACTCCAGGTGTTGG7TCATTCTGA 

GAGC7C7GAGAGAGAATCTGTTCCAGGCT7CCC7TCAGTTTGTGGTAGCT 

CCAGGG77CC77GGC7GGTGGCAGCAAAACTCCAGTC7CTGCCCCCATCT 

7CACA7GAC7G7C77CTCTC7G7G77TCTGTG7CCAGAXTG7CC7A7AAG 

GACAGAG7CA7AC7GAA77AGGGC7CAC7CGAAXGACT7CATC77AAGT7 

GAAC7G7A7C7G7AAAGACCT7AT77CCAAGTAAGG7CACA77CACAGC7 

AC7GGGGGA7AGGACCTCAACATA7CT7T7TGGGGGACAXAATTCAACTC 

A7AA7ACCCAACA7GATAACTGTTCATCCCATGAAATTTAATGTCTCTCA 

AAAGG7GA7C7CAGGGCATTTAATCTGTGACAGAAACTCCCA7AGGAAAC 

ATTCCAACCAGAAGCTCCTTTCACAGCTGGTCACTCCTCCTACCCCATCC 

GAGG7CC7GGGGCAGGGTGAGGCAGGTGGGGACAAGAAGAAGGCTGTC7C 

GGGTG7AGAAAGAGAAGACCCTTATTCACCCGGCACTCTGTTCATGAATG 

AGCTA7CCAGCATAGGATATAATAAATCGC7TTAGGAGTGGTAGACTCCA 

AACA7777TTTGGTCCCAGTTATCCTAA7CAATTAAACAAACTCTAGAAC 

CCATC77GAAG7GCAGGCATTGGGACAT7A7GAAACTTACACAGAATTCA 

AAAA777ACAAGGGCTAAATAAAACAGGGTCTGACATCTAATATT7TCTT 

CCCACA77CCCATGCACTGTCTGGCTCAACCATCCCCAACCCTCACTCTC 

A7CC7GG7GGACACATGCCTAGTGATGTGATCAGCTGGTTCACAGGGGGC 

7GG7GA7GG7GGATA7ACAGC7TTTGCCAATTTCCA7GGCA7AACTACTC 

CAAA7A7GGCCAA7TTCAAACTACCAACATGAAGGCACAGACACAGAGTT 

7GGAAGAGAXGTTAGCAATTGGCTATTGCAAGC7GATATAAGCTAGCTCC 

AGCACAGCACCACCGC7ACCTTTAAGC7CCT7G7GTTAGTGCAAGGGTTG 

GCAAAC7GCAGCC7AAAGGCCAAA7ACAGCTTAC7GCCTGTTTTGXGTA7 

77GCCAACACAGCCA7GC7G7C7A7GGCC77C777GTTC7GTAACAACAG 

AGCATG7GGCAGGA7ATATAGAATCTGGCAG7C777AATAAGTGC7GAC7 

CC7GC7C7ACAGGAGAACACAGATTG7C77CAGCT7CCAAACA77CATC7 

C7GAG7CAG7C7AGAGCC7GAAA7TTAGAC7GAAGCACAG7TTCCACCAG 
AGGGCTGCAATGCATGA^AGTTGGGGTTrTCACCTCICACCCAAAAGCCT

ACTCAATTTTTTACTGCAAAAACATGTTATCATCATTATTTTTTACTTAG 

■~CCAC CTTTCCTTGGCAATTTTCCATAGGAAAATGCATTCTAAATTTCAA 

CTAATCAGGGGACTTGGAGCCTCTGGACACCCCCTTGTTCCTTGCCCACA 

3TCCCTTGCAGAAGGTGCCTTATCAGAGCGGCTCCATGCAGGGGCTCAGG 

ACAGGA7CAGATGTCAGTTGCACCAAGGGGGCAGGGACAGATCCTCTCTG 

'TfiACCATGCAGAAGGGACTGTTCAGTGCACCGTCATGGTCCTGGTGATT 

Tr TGGTCCATAAGGGAATTTTCACATGCATCGGGTGATTGTCACATCAGC 

ACAACACTGTGAGGAAGGCAGAGTGAGAATTTGTGTGCCCATTTTATAGG 

TGAGAAAACAGATGCAGAGACATTAAGTAACTTCACCACAGTCATGCGGG 

TTTTAAGTGGCAGACTTTCAGGTGT7GTGACTCCTAGTCCAGAGTTCTTT 

GCACTGCCCCTGAGGTGCTAAAACTCTACTGTGCTTTAAGACTCACTTGG 

GGAGCTTCCTAAAAAGAGAGATTGCACAACCTGAGATTCTTGTTTAACTG 

TrrTGGGATGTAGCTCAGGGATCTAGCTGCCTTAAAAAAAAAAACTCCCA 

AGTAAT T CTGATGCAAGCGGTTCTTTTTTGTCCACCTTTGAAGAAACACT 

GCCTCCTCCCCATACATTTCATTAGAAAATGGTAACATGTTTTTCAGCCT 




TGAGTCTGAACCATGGAAAAATTCCATTGTGTGGCACATCTGGTGGGTGT 

GCACTGTCCCCAGCAGTGAGGCACCCAGTGAAGACAGCAGCTGGGAGAGG 

CTTAGTTACATGCAGTGGGACAGTGTGGGCTAGACTGCTGAGCCCtCTGC 

AGTTTACT CTGTGT CAGGCAATGAGGGTGAAAGGCTGATCAGAC CCACGT 

GCAGACCATACCCTCCAGGGAGACAGATATCAGTCAGGACAACCCCAAGT 

GTAGCTGGAGAAGCAGTGCCCAGGTATGACCGGATGTGTATCCAACCAGG 

AAATCTGCATATAAATATAAGAGGAGAAAATGAACAGATGTTGCTCTTAT 

ATGTAGATATTTATGAAGAGCATATAATTTTGTTTTGTGTGTITTAAGAA 

GTTTATAAGTATGCCTTAAAAATGTATAGTATATACTGTAGGTATTTTTT 

rr A.TT AGAT ATTTTGTTTTTCATACTT ATC CACATTGACATTGTA GCAA C 

AGTATAATATAACAACCTCCTCTACAAAAGCAGAAGGAAGTGAAGCTTTG 

GAAGGAAGCAC C CAGTGAGCTTGC CC CTTT CAGGTGGGTGCAGTGAGCAG 

GAGTCAGTGAGGTTGAGATCCTTTGAGAGGAGGCAATCATTAACCAGGAA 

ATCTGCACTGCATCCTGGCCACACCTAACCCTTGGACAATGGTGCTTGGA 

GCGCCTTCCAGCTCTTAAGGCTTGCGATTTCTTTCTCTCACTCTTCACCC 




GGTGTCT"~rGAAAGCTGAACAAGTTAGTAAATTTCAAACCTCATTTCACC 
CAC CAGTAAAATGGGAATAATAATAAACCTATTTTACATAGGGTTGACAA 
3AGGAGTAAAGAGGGATTCAATGAAAGTTCGTTATTATCATTTGTAGTAG 
CAGTGT-GATAATATCAACTGAAAGTTCATTATCATTATTAGTAGCAGTA 
■"TGATAACCCTCTTTTCTGTGCCTTCTCACTGGTGGGCCCAGGCCATCAG 
CAATGCCCAGGGTGTCATGGATCTCTGCTGCATCGGGCACCAGCTGTGTC 




GCGGGTGTGCCTCCTCTTGTGGAGCCTTTCTGCGTGGGTAGGAACTGCTG 

CAGCTGTGGCCATGGATTCACCTGAATATGGG7GGAATTAGGCATTCAGC 

T GGGTT AGCTGTGC CT AGAAGGAGGAACTCTAAACTGAGAACTTGTC CCT 

ATTGCCACCTCTGATAGGCAGATGATCCATCCATCAGTGGCTGAGCTGAG 

GTGTGCATGGGGATGGGTAAGAGCCCACACACAGGGCTGATGACTGAGTC 

TATTTAGAACAATAGATGTAAAATCTGATAATGTAAAATGTGATAGATTA 

TTTTGTCAATTAGAAATGGTACCATATAATTATATATATACATAAACATG 

TATACATATACACACATATACATGTGTGTATAAACACACACAGTATTGTC 

CCCTACTCATTCCATAAACCTGATGCCTTTAGCTGGGATTCCCAGCTTTC 

A.CTCTCCTCTCTGTCATCTGCTGTCTATATCCTCCCCATCCTGTAATTCT 

GGCrTATATGCCACTTCCTCCCTAAAGCCCTCCCTCAATCCCTTGCTGGA 

AGTGACATTTTCCTCTTTGAGCTGCCCCTGCTTGTGCTTTGGTGAGGTCA 

GCTGTATTGCAGTACCTTGTATTGTGGTTGTCACATCATCGTATAGAATT 

AATT^C^GACACATTCCGTATTTTTCAAAGGGCCTAGTGTGGGGCTTTAA 

GAGTAACTACGCCACCACGCCCAGTTAATTTTTTGTATTTTTG^ 
CTGCACCCAGCCACCTATCAAAATTrTAAGTGCCATTTTTATTTTTTATT

TTTTGTAGAAATGGACAAGCTGATCGCAAAATTCAGATGGAATTGCAGGA 

GGTTCCAAATAGCCAAAACAATCTTGAAAAAGAAGAACAAAGTTGGAGGA

TTTACACTTTCCAGTTTCAAGACTTAGCTCTTAGCTACAAAGCTACAGTA

ATCAGAACACTATGGTCCTGGCATAAGTGATGCTGGACAGGTGAGCCCCA

AAGTGGGACTTAACCTGTGAAGGTTCTTGGCCTTGCCCAGGAAGGAATTC 

AAGGGCAAGCCAATGGGACAAGAAAACAGCTTTATTGAAGGGGCAGTATT

ACAGCTCCAGCCCTGTTACAGCTCCAGCCCTGTTACAACTCTGACTACTC 

CTGCACAGAAGGGCTACCCTGTAGGCAGAGAGTAGCAACTCAGGGCAGTT 

TTGCAGTCATTTATATCCACTTTTAACACATGCAGATTAAGGGACAATTT 

ATGCAGAAATTTCTACGGAATTGGTAATAACTTTTGGGTCATGGAGTCAT 

CATGGAAGGGGGGCGGGGAACTCCCTGGTGTTGCCATGATGACGGTAAAC

TGATATGGCGAACTGGTGGGTATGTCACATGAAAAGCTCCTTCCACCCCA 

GCCCTGTTTCAATTAGTCCTCGGTTTGGTCCAGTGTCCAAGTCCTGCCTC 

CAGAGTCAAGTCCCACCCCCTACCTCTTAAGGAGAGATGTAAATACATGG 

AATAGAATTGAGAGTCCAGAAATAATCTCATACATCTATGATCAATTGAT

TTTCAGCAAAGGTGCCAAGACCATTCAATGAGGGAAAGAATCATATTTTT 

TTCAACAAATGGTGCTGGATAACCACATGTGAAAGAATGCAACTGGGCCC 

TTATCTCACACCATATACAGAAATTAACTCAAAATGGCTCAAACACTTAC 

ATGTAAGAGCTAAAACTATAATATTCTTAGAAGAAAACAGGGATATATCT 

TTATGACCTTGGATTTGCTGGCTGATTCTTAAATGACACTGAAAGCACAA

GCAACAAAAGAAAAAAAAATAGGTAAATTGGACCTCATCAAAATTTAAAA 

CTTTTATGCTGGGTGCACACCTGTAATCCCAGCACTTTGGGAGGCTGAGG 

CAGGAGGATCTCTTGAGCCCAAGAAGCTGAGGCTACAGTGAGCCGAAATT 

GTGCCACTGCACTCCAGCCTGGGTGACAGAGCAAGACCCTGTCTCGAATA 

AATAAATAAACAAATATATAATTATAGATCTCTGGATCTTGCCTTCGGAG 

ACTGACTCAACTAACTGGTCTGGGTGGGAGCCCAGCCATTTGTATTTTTT 

GAAAACTCTCCAAATGATTTTACTGTGCAGCCAAGGTTGAGAATCACTGT 

ATCATAGGGTTGGACTCCTAACTGGAAACAGTTTGCACCATCAGGTGTCG

CAGCATTCTGATAATAGTTAAGCTTTCCTCCTAGATTrrCTGATATTAGA 

TGAGTCATGTTTACAAGTTTTTACCAAGAGACAAACTATCTTTCTGCCCT 

TACTTTCTCTCTTATACTATTCTAATCCCAGAACCCTTTGGAACTTCCAC 

TGAGAGATGAATCTAGAAAGTGACTCTCTTGGCTACAACAGAGAGTAATG 

TTGGCCTGTTTGTGCCAGATCCAGTTGGTGCTGGTGGTGGGACAGCACCT 

CCCTGAAATCCCCTCCTCTCCCGTCAGATTCAGTCCCCCATTTGCATCAC 

GTACAATCATCACTATGGGTTTCTATTACCTTGCTAGGGCATTTGGAGGT 

ACCATATATACCAACTATTAGTTTTGAGCCATGGTTCCCAAAGTGTGGAC 

"G T AGGGCACCTCAGCACACTCACGAGGTGTCATGGGATATTTAAATATT 

CTGAAGAAAACACAGTGACATCTGTCAGGCCCGTGAAAACCGTTGGCATT 

AAATTGTCTCAACCCAATTGCTTAAGAAGCAGAACTGGCCAGGCACGGTG 

GCTCACATCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCAGATCAC 

GAGGTCAGGAGTTCGAGACCAGCCTGACCAACATAGTGAAACCCCGTCTC 

TACTAAAAATATAAAAATTAGCCATGCATGGTGGCATGCACCTGTAACCC 

CAGCTACTCAGGAGGCTGAGGCAGGAGAATTGCTTGAACCTGGGAAGCGG 

AGGTTGTAGTGAGCCAAAATCGTGCCACTGCACTCCAGCTTGGGTGATAG 

TGAGACTACATCTCAAAAAAAAAAAAATGAGAGAGAGAGAGAGAAGCAGA 

ACCATCAGGTGTTTCTTTTGGCTTAAAGTACTCrrGTGAAGAAATTCC^ 

GACACGAAGGATACCATGAACTGAGAGATTTTGGGAACCTCTGCTTTAGA 

AGCTGGAGGTAGCATTCCTTGGGCACAGTACTGCCTTGGGATCAGCAAAT 

CCTTTTGATGGTGCATTTAGGTGTGGCAAGACAGCTCTTAGAGTGGGACC 

GGGATGTGCTTGGAGACAGAGGGAACTAGATTGAGCTGCCCGATAAAGAC 

ATGCCAGCCTGGCAGAGTGTAGTGACTCATGTCTGTAATCCTAGTGCTTT 

GGGAGGCTGAAGTGGGAGGATTGCTTGAGGCCAGGGGTTTGAGATCAGCC 

TGGGAAACAACAAGACCTCTACAAAAAAAAAAGAAAAAAAAAATTAACCA 

CATGTGGTGGCATGCACCTGTAGTCCCAGCTACCTGGCAGGCTGAGGTAG 

GAGGATCACTTGAGCCCAGGAAGGTAAGGATACATTGAGCCATGACTGTG 

CCACTGCACTCTAGCCTGGGTGACAGAAAGAGACTCTGTCTCAGAAATAA 

ATTAAATAAATAAATAAATATATAGTGGCCATGACATCCCTAGAAAGACA 

AGGTCCTGGGAATAGGTAGAAGCCAAGGGAAATGAGAAATGAGAGGGGGC 

CCTGGAGCTGGAACTGGGGGAGCAGGATGGCCTCTGAGAAGTTCCTGATA 
GTGGTGTCACTGATGTGTCTGATGTTTAGTTGTAATTATTTGCTGGGCCC

CTGTCATCCCTCATATCTGATAGCTCTTTGCTAGTCAAAGTGTGGTCTGG 

GGATCAGCGGCATCAGCATCACTTGAGAACTTGTTAGAGATGCAGAATCT

AGAGCCCCACCCGGGACCCAGAAACAGAGCCTGCATTTTAACAAGCTCCC

CAGGTGATTCTCACACACACTCGCATTTGAGAAGCACTGGGCTAGTTGAC 

AGATTCTCAGGCATGGCTGACATTGAAATATCCAGGGAGCAGGCTTGGCA

TTAGGATGTTTAAAAGTCCTCCAGGTGTTTCTAAAGCCAGGTTTGAGGAA

TTACTGGGCTGATACAAATGTTTTGTGATGATGCTTTGTGTGTGTGTGTG 

TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTAGGGAATTC

TGGGTCACTTGGCACCAACACAGGAAACAATGGAAATATGTGAGCCATGA

CAGAAAGGTCAGGAGATAAAAGAAATTAGTGACATGAGAGGTACTCCTCA

GGTGTTAGGAAAGAGGGTAGAGCAAACCAGGTTTTCCACCATATGTTGGA

TAGGGGGTCAAGTAAATTTCTACTTAAAAATTACAAACAGGGGCTGGGCG

CGGTGGCTCATGCCTGTAATCCCGCACTTTGGGAGGCTGAGGAGGGCGGA

TCACAAGGTCAAGAGATTGAGACCATCCTGGCCAACACGGTGAAACCGTG

TCTCCACTAAAAATACAAAAATTAGCTGGGCATGGTGGTGCGTGCCTTTA

TTCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCTGGGAG

GTGGAGGTTGCAGTGGGCCGAGATCGCACCACTGCAATCCAGAGCGAGAC

TGTGTCAAAAAAAAAAAAAAAAAGAAAATTCCAAACAGGATGACCCTAAG

CCTGCAGGACTTGGAGACATCTAGGTGACTGATACTCAGTCACAAAACAT

AATTGGTCACAGGCCTGATGAAATGCACAGCAGACCTTCAGATGGTATGC

ACTCAAGTGATATCCACAAGTCCACCTAAAGAAATGCTATATTCAGACAT

~TGGCATCAATCTCTATCAAACAAAGATAGTCCAAAGCAATGGGTTCCAA

AAACACTTTCCTAAGACAAATTCTCTATTTGCTTTTAATATCAGTCATCC

CAGCC~TTGGAATAGAGGAGCAAATGATACCAGTGGTACCCTACCACAAT

GCACCAAGGTATTATACTCTCATGCTCCATTTTCTCCCTCTGTCTACATC

ACTAATAACTCATTGATTTCTGGTGCAAGCCCTGCTGGGAGAAAAAGTCT

ACTCTTGTACCTTGGAGCAAGTTGCTCAGAGTAGGTATCGAGGATAAAAT

TTGGAAAGTTAGAAAAGCTATTAGAAGGAGATCCTAGTAGTTGAAAACAC

AGCCTGGCCAAGTCAATGATGCTATTTCATCTCCCCAGCCTTGCATGTCC

ATAGCTAAGGAAGACAATTTAGGCTTGGGCTAGAGGATGGGAAAGGGCAA

AATTACTGATGCCACAGCCCAGAGAGGTATTCTAGTAATCTGAGGGTGAG

GACCACATACCTGGTTCAGGGACGTACAGTGTTGACAGCTGTGAGTGGAT

GCCTGGAGTTCTGGCGTGTCTTCTAGCACAATGATACCTGAGACTCTTGC

ATCATTGGGAATAATAAAATGGGAGTGGATAGATATGAAATTATGATGGC

AATAAGCAATCAGCTAATAGCTTCATTGATGGGACAGATTAAAGATGGCT

GCAAATCCTTTGGTCCAGGTTTGGGATATAGGCAGCATTTGTATTGGAAT

3CTGATAGTCTGAGGCCATGAAAAGTCCACCTGCAGTAGTGGTAGGAGGA

ACAAGCCTCACTTTCTTCAATGTGTGTGACTGCTGTCTTGATTCCCTGGG

TGGCCAGTTCCATTCGTGTGGTTCTTTGGTCCACTTGACTCTGGGGTGGC

TCTGTGATGGCTTGACCAATACAATGTAGTGGAAATGATGCTGTCATCAT

TTCCAGCCTCTTCCAGCCTTAAGGAACTGGCAACTTTTATTTCTGTCCCT

TGGAATACTTGTTCTTGCAACCCATCCATCATACAGTGAGAAATTCTAAG

CTGCCCCATTAAGAGGCCCACATGGTGATAAATTGGGGTCTTACATACAG

CCCTAGCTGTGCTCCTAGCTGACAAACAGTAGCAACTTGTCACCAGGCGA

GTGAACCACTTAGGACTGTATACTCCAGCCCCAGTTGAGCAATGTGGAAC

AGAGTAAACCATCTCAGCTTAGCCCTGCCCAAACTGCAGAATTATGAGCA

AAATAATCCCCTAGGCTTTGGGCTGATTTGTTCCAGATTACTGGAACAGA

ATTTGGTACCAGGGGTGAGGTGCTACAGCAATGAAAGCTTAAGACACGTG

ACTTTGGTTTTGGGTCTGAGTGGCAGGGGAACTTGGCAGGCCTCAAGGAA

ACTTTTAGGGAGGGTTGAAGCATAGTGAGGAAAACAGTAGGGGAAGCTAG

AGGAAAAAATGATGCTTGGTATGTAGTGGTGGGAAGTTTAGCAAAACTCG

CTGATGTAATGTGGGAAATTGTAAGAACTCAGAACGATTTAAGGGCATG

TTTTATAGGTCCTTTAAGAAACTTCTAGGCCAGGCGCAGTGGCTCATGTC

TGTAATCCCAGCACTTTGGGAGGCTGAGGTGGGCGGATCACAAGGTCAGG

AGATCGAGACAATCCTGGCTAACATTGTGAAACCCCGTCTCTACTAAAAC

TACAAAAAAAAATTAGCCGGGCATGGTGGCGGGTGCCTGTAGTCCCAGCT

ACTAGGGAGGCTGAGGCAGAAGAATGGCGTGAACCTGGGATGTGGATCTT

GAAGTGAGCCCAGATTGTGCCACTGCACTCCAGCCTGGGCAACAGAGTGA

3ACTCCGTCTCAAACCGAAAAAAAAAAAAAAAAAAAGAAACTTCTAGGGC
TGGTCCCGTGGAAGCCTCACACATGGTACACAAAGGCTGTCTTGAAAAGA
AACGTAAGTGTGTTTrTTGGTTTAATAAAATTGATTATAAATGGATAATG 
CAAAACATTTTAAAGAATTTTACTAGCTTACATTAGCAGATTTGGATCCA 

gtgattgttacattctggtactgagcccctgaattacttctttgagtaag 

gcattataccaaagctattgatagttgggcttatagggtgtatgtttgaa 

gaactactaatgtcaaaaccaatatttcacggtcgacaagaggacatcag 

aactggtaatccttattaccatgactggctggacagaatactcaatgtaa

tgggatttcctgcaaataaagacggggaagatgtaaaaaagatgcctgaa 

cattcaacattaatgaaagatttcagaagaaatatgtatactaactgcag 

ccttatcaagtatatggaaaaacacaaagttaaaccagatagtaaagcat 

tccactrgcttcagaagtttcttactatggacccaataaagtgaattacc 

tgagaacggggtccctgtttcttcgaagacccacttcctacatcagacgt 

tttcaacagttgtcaaatcccctacccaaaatgagaatttttaacagaag 

aagaacctgatgacaaaggagccaaaaagaaccaccaccggcagcagggc 

cataaccacacgaatggaactggccacccaggaatcaagacaacggtcac 

acacagggacccccgttgaagaaagtgaggcttgttcctcctaccactac 

ctcaggtggacttttcacggcctcagactatccgcgttccaatccacatg 

ctgcctatatcccaaccctggaccaagcacatcccagccgaagagcagtg 

taggatactcagctacctcccagcaggctccacaggacccacgtcagaca 

cacgggtactgagctgcatcggaatcttgtccgtgcactgttgtgaatgc 

tgcagggctgactgtgcagctctccgtgggaacctggtatgggccatgag 

aatgtactgtacaaccacacctgcccagtagccaagttccttccaccgct 

tttcacagatcggggtagtggcttccagtttgtacctattttggagttag 

acctgaaaagaaagcgctagcacagtttgtgttgtggatttgctactttc 

atagttaacttgacctggctcagactgaccagtacttttttttccgtgac 

agtctatagcagttgaagctgagaatgtgctaggggcaagcgtttgtctt 

catatgtcatgaattcctccagtgtaacaacattatctgaccaatagtac 

acacacagacacaaggtttaactggtacttgaaaacatacagtaggtgtt 

aactcagtgaaataaccaggactcaaagtaagattattttggtacacctt 

tcttgttagtgtcttatcagtgagttgattcattttctacattaatcagt 

gttttctgaccaagaatattgcttggatttttctgaaagtacaaaaagcc

acatagtttttttcagaaaggtttcaaaactcctaaagattaatttccaa

GTATAAGTTTGTTTTTATTTTCAATCTATGACTTGACTGGTATTAAAGCT 

GCTATTTGATAGTAATTAGATATATTCTCATTGATATAAACCTGTTTGGT 

TCAGCAAACAAACTAAAATGATTGTCACAGACAATGCTTTATTTTTCCTG

TTGGTGTTGCTTGTGGGAAAAAGAAAGAGAGATCAGATTGTTACTGTGTC 

TGTGTAGAAAGAAGTAGACATAGGAGACTCCATTTTGTTCTGTACTAAGA 

AAAATTCTTCTGCCTTGAGATGCTGTTAATCTATATAACCTTACCCCCAA 

CCCTGTGCTCTCTGAAACATGTGCTGTGTCCACTCAGGGTTAAATGGATT 

AAGGGCGGTGCAAGATGTGCTTTGTTAAACAGATGCTTGAAGGCAGCATG 

CTCGTAAGAGTCATCACCACTCCCTAATCTCAAGTACCCAGGGACACAAA 

CACTGCTGAAGGCCGCAGGGACCTCTGCCTAGGAAAGCCAGGTATTGTCC 

AAGGTTTCTCCCCATGTGATAGTCTGAAATATGGCCTCGTGGGAGGGGAA 

AGACCTGACCGTCCCCCAGCCCGACACCCGTAAAGGGTCTGTGCTGAGGA 

GGATTAGTATACGAGGAAGGAACGCCTCTTTGCAGTTGAGACAAGAGGAA 

GGCATCTGTCTTCTGCCCGTCCCTGGGCAATGGAATGTCTCGGTATAAAA 

CCCGATTTTATGTTCCATCTACTGAGATAC3GGGAAAACCACCTTAGGGCT 

GGAGGTGGGACATGCGGCAGCAATACTGCTCTTTAAGACATTGAGATGTT 

TATGTGTATGCATATCTAAAGCACAGCACTTAATTCTTTACCTTGTCTAT 

GTTGCAGAGACCTTTGTTCACGTGTTTATCTGCTGACCTTCTCTCCACTA 

TTATCCTATGACCCTGCCACATCCCCCTCTCCGAGAAACACCCAAGAATG 

ATCAATAAATACTAAGGGAACTCAGAGGCCGGCGGGATCCTCCATATACT 

GAACGCTTGTCCCCTGGGCCCCCTTATTTCTTTCTCTATACTTGGTCTCT 

GTGTCTTTTTCTTTTCCAAGTCTCTCGTTCCACCTAATGAGAAACACCCA 

CAGGTGTAAAGGGGCAACCCACCCCTTCATTGCTGATTTGTGAGCGTGCT 

TTAAGGTGAAAAAAGCATGAATGTTAACTTCCTTAAAAAGGTACAGCATC 

CAATTCAAATATTTTTGTCCTGATTTTAATGCTAGTTGATGTAGTGCTAT 

TAAAATTTTGTTCAACATGGACACAGAGAGGGGAACAACACATACCAGGG 

CCTGTTGCGGGGTGGGGATGAGGGGAGGGAACTTAGAGGACAGGTGAACA 

GGTGCAGCAGATCACCATGGCCCACATATACCTATTTAACAAACCTGCAC 
GTTCTGCACACGTATCCCATTTCTTTTTTTTTTTAAGAAATAGAAAAAAA
AATAAAATTTTGTTCACTGATTCTTCCATTTTAAAACTTGTTTGCATGTG
GTTTAGGATGCCCTTACTTCAGCAAAGGAGAAGGAATAGGAGGGCCTTAG
AATTTTTGAGGGAAAAAAACCCTATAACATACATTGTACTGTATCAAACr 
ATTTTACATGAATGACACAAGTATTCTGAATAAAAAAATAATTGAACATT 
GTTAAGAACAAGGTGTCATGTAATTTATTTTTCATAAATAAAAAAATTAT 
AGXGGCTTAGACTGAAAGGAACAGAGAATTTAAAAAATTAAAAAGAAGCC 
TTAGTATATTTTTGTATATAGTTTCCATGTGCCATATTTGCCATAATTGG 
ATGAGAATTTTTTGACCTCTGGCAGGGTGACCCTATATTTTCANTNTATA 
AAGCGTGCATCATACC 